Then, it seems, a good solution would be to have the server owner declare in advance what is intended use and what isn't. Accessing information without providing the correct password is certainly unintended use, as is guessing passwords. And accessing while knowing the password is definitely the intended mode of operation.
A logical next step is to make that declaration machine-readable. Oh, wait: suddenly we're back to the server software and configuration, which the server developer/administrator had screwed up.
My question is: why don't we take that logical step and simplify things, instead of relying on some "should be common sense" and "you should've known you weren't supposed to do that" completely-gray-area?
Sort of, but in machine-readable form and at a well-known location (like /robots.txt), so you could read and comply with them before you access the site.
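For illustration, here's a minimal sketch of how a client could check such a hypothetical machine-readable terms file before scraping. The /terms.txt path, the "Directive: value" line format, and the directive names are all my invention for the sake of the example; no such standard exists:

```python
# Hypothetical machine-readable terms file, analogous to /robots.txt.
# The path (/terms.txt) and the "Directive: value" format are assumptions.

def parse_terms(text):
    """Parse 'Key: value' lines into a dict of declared usage terms."""
    terms = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" in line:
            key, _, value = line.partition(":")
            terms[key.strip().lower()] = value.strip().lower()
    return terms

def scraping_allowed(terms):
    """A polite client refuses to scrape unless the site explicitly allows it."""
    return terms.get("programmatic-access", "deny") == "allow"

example = """
# Imagine this served at https://example.com/terms.txt
Programmatic-Access: allow
Rate-Limit: 10/minute
"""

terms = parse_terms(example)
print(scraping_allowed(terms))  # True for this example file
```

A client would fetch this once before the first real request, the same way well-behaved crawlers fetch /robots.txt today.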
As for those exact terms, I suspect (IANAL) they prohibit almost any access to the site: for example, they forbid any programmatic access to obtain the information, and I haven't heard of any non-software user-agent implementations.
You can translate "programmatic" as "automated", as in "someone coded a program/tool to, in a programmatic way, access the website and retrieve the data".
As opposed to a human being, in a non-programmatic way, opening their browser and accessing the website.
> someone coded a program/tool to, in a programmatic way, access the website and retrieve the data
Doesn't Firefox, for example, fit this description perfectly? Yes, I do manually enter the base URL to access, but if that's the distinguishing feature...
> As opposed to a human being in a non-programmatic way, opening his browser and accessing the website.
... then manually typing in ./scrape.py www.att.com is non-programmatic, too. :)
Or maybe I'm not getting the correct meaning of "automated" due to bad English comprehension and false analogies from other languages. But I always thought every request on the Internet is automated, made by some kind of hardware+software combo, so forbidding "programmatic" access is complete nonsense (access control and rate-limiting are the proper solutions).
(And, if that matters, the author of scrape.py does not need to conform to AT&T's TOS if they don't actually use the script themselves.)
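To illustrate what "rate-limiting as the proper solution" could look like server-side, here's a generic token-bucket sketch; it's a textbook technique, not any particular server's implementation, and the parameter values are arbitrary:

```python
import time

class TokenBucket:
    """Generic token-bucket rate limiter: roughly `rate` requests per
    second, with bursts up to `capacity`. A sketch, not production code."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens replenished per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start with a full bucket
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=3)
results = [bucket.allow() for _ in range(5)]
print(results)  # first 3 rapid requests pass the burst, the rest are throttled
```

The server enforces the declared limit mechanically instead of arguing afterwards about what "programmatic" meant.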
Wait: so before accessing a website I have to go read its terms of use?
What if I set up a website, put a clause saying "you agree to pay $50/page view" in there, and hide it away somewhere? Google's crawlers will find my site in no time, and then I can start raking in the dollars, right?