Analyze robots.txt with the Python Standard Library

2023-09-24
If I hadn’t searched for both “python” and “robots.txt” in the same input box, I would never have known that the Python Standard Library can parse robots.txt with urllib.robotparser. But the official documentation for urllib.robotparser doesn’t go into detail. With the documentation, you can check whether a URL can be fetched by a robot with robot_parser_inst.can_fetch(user_agent, url) if you are building a crawler bot yourself. But if you want to do some statistics about robots…
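As a minimal sketch of the can_fetch check mentioned above: the robots.txt content, the "MyCrawler" user agent, and the example.com URLs are all made up for illustration, and the rules are fed in with parse() so that nothing is fetched over the network. For a real site you would more likely call set_url() and read() instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of downloading it."""
    parser = RobotFileParser()
    # parse() expects an iterable of lines, not a single string.
    parser.parse(robots_txt.splitlines())
    return parser


robot_parser_inst = build_parser(SAMPLE_ROBOTS_TXT)

# "MyCrawler" is an assumed user agent name; it falls under the "*" rules.
print(robot_parser_inst.can_fetch("MyCrawler", "https://example.com/index.html"))
print(robot_parser_inst.can_fetch("MyCrawler", "https://example.com/private/data.html"))
```

With these sample rules, the first URL should be allowed and the second disallowed, because the parser returns the verdict of the first rule line whose path prefix matches.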