I abandoned GitHub when they took code that was not licensed (i.e., copyright retained), reproduced it, and saved it in their Arctic Vault without the author's consent (mine).
What's wrong with the Arctic Code Vault [1]? Is the only problem that they didn't seek your consent? How is it different to deploying a new availability zone and having your public repos accessible on another server? Your code is preserved verbatim, and it's not possible for GitHub to provide their service without the right to make verbatim copies of your code, which presumably you agreed to as part of their ToS.
> is basically reprinting it without my permission.
What if I were to tell you that in order to publish any code on the internet, that code has to be "reprinted" to many different computers and places?
In fact, whenever you yourself need to even access that code, it is copied over to many different computers along the way, as is necessary to send it to you.
But LTO is fine? I was going to ask if it was because it's not intended as a backup, but that's not even true: this is intended as a backup on a long time scale.
I haven't read this interpretation of the Arctic Vault project - presumably most users of GitHub are okay with their code being reproduced/backed up across many production servers for fault tolerance. Making an 'extra special' long-term backup in the Arctic Vault doesn't seem like a meaningfully different action to me - i.e. using a cloud-based host is essentially opting in to this kind of 'license violation'.
If they had taken one of their existing DB/disk backups and called it a vault, would that have been an issue?
Github does not own the Arctic Vault, there is an independent company behind it [1]. Given its purpose as a long-term archive, it is likely that the copyright exemptions for (library) archiving can apply here. [EDIT: This is probably not true, see the reply for the reason.]
> Github does not own the Arctic Vault, there is an independent company behind it
Github are the ones doing all the archiving. So, in essence, they do own that. Piql are just the ones providing the storage: it's a commercial for-profit entity employed for backup by another commercial for-profit entity.
It is technically true, but the Arctic World Archive specifically "accepts deposits that are globally significant for the benefit of future generations, as well as information that is significant to your organisation or to you individually" [1]. So it doesn't accept just any data (at least as far as I can see), and the Github archive must also have met these criteria.
By the way, my initial statement that it may qualify for copyright exemptions turned out to be false for a different reason. They only apply when the library and/or archive in question is open to the public, and the Github Arctic Vault isn't. Thus I think it's actually a Github's generic usage grant in the ToS [2] that allows for the Vault. Copilot is, of course, very different from anything described in the ToS.
...provides prime-rate marketing bullshit in its marketing materials
> Thus I think it's actually a Github's generic usage grant in the ToS
If you refer to Section D.4, then:
- Arctic Vault is not "for future generations", but for GitHub only, since that section doesn't permit GitHub to just make copies willy-nilly for anything other than "as necessary to provide the Service, including improving the Service over time" and "make backups"
- This specifically makes GitHub "the owner" of that data, and not "some third-party" as you originally suggested
If you insist on the term "owner" for copyright grants, you have a faulty understanding of copyright. The terms of service, much like a software license, only allow the licensee to do some specific things (in this case, including backups) under certain circumstances agreed upon in advance. Copyright assignment, which is akin to an ownership transfer, is much harder.
> This specifically makes GitHub "the owner" of that data, and not "some third-party" as you originally suggested
This one is my fault though. I used "Arctic Vault" to mean an archival site, but as I later realized, it is Github's archive stored in the Arctic World Archive. So yeah, it's (only) Github that can retrieve the data.
This is a commercial for-profit company, GitHub, taking some code and storing it in the cold storage of some other commercial for-profit company, with no one except these two parties having access to this code. And it doesn't look like GitHub even has the right to do it, because it stores it for some purpose other than whatever is stated in their ToS.
I wonder if the whole kerfuffle around Copilot will end up shedding some light on this, too.
How is the Arctic Vault different from any other offsite backup?
I suppose one issue is that you (presumably) can't request deletion from it (which may even be a GDPR violation).
Edit: I looked up the relevant GDPR stuff, apparently there's an exemption for when "erasing your data would prejudice scientific or historical research, or archiving that is in the public interest.", which arguably includes the Arctic Vault.