
> “Intellisense with AI now available on VS Code” is Microsoft’s nice way of saying “We harvest the shit out of your data”.

What? The IntelliCode description says that "Contextual recommendations are based on practices developed in thousands of high quality, open-source projects on GitHub each with high star ratings." I'm not clear on how you think Microsoft will harvest data from doing this - much less how it is related to ads.



There is telemetry built into VS Code that you can turn off, but it is turned on by default. There is no way to know how they're using that data. Microsoft also collects data on the Office 365 platform and even local Office 2019 installs.

I was trying to make a tongue-in-cheek point, rather than specifically digging into the exact methods of how they collect data. You may be right, but it is beside the core issue we are discussing here.

At the end, it is irrelevant how they're collecting data - all you need to know is this: https://about.ads.microsoft.com/en-us


We don't do any training on your data. Telemetry is largely for feature use tracking (eg we should really do better around this component, everyone is using it). There is absolutely no ads overlap at all. Literally none.

Disclosure: I work at Azure but not on VS Code


A big tech company harvesting extensive user data by default and saying it's ok because "Trust us, we don't do anything bad with it" is not a compelling argument these days. Even if it's true (and that's a big if), that says nothing about what Microsoft will do with the data tomorrow, or the next day.


It's on by default, but literally the first thing we ask you, actively, is about giving us feedback, and we let you turn it off right then. It's not remotely hidden.

I totally understand, though about trust. We've been asking users for feedback for 20+ years, I hope we've shown ourselves to be good actors!

Disclosure: I work at Azure on machine learning


> We've been asking users for feedback for 20+ years, I hope we've shown ourselves to be good actors!

Not sure about the Azure team (since my experience with Azure is limited to a one-off project that required a SQL Server instance to test compatibility with some client's legacy DB, plus some very recent dabbling in VS Code and Azure Data Studio), but whichever team runs Windows nowadays, for example, hasn't exactly "shown [them]selves to be good actors". Or is preinstalling Candy Crush even on "Professional" versions of Windows 10 considered being a "good actor" by today's standards?

(Sure, even the first versions of Windows had games preinstalled, but those were first-party instead of third-party, they ostensibly existed to help users learn how to use a mouse, and they didn't plaster themselves front-and-center on my Start menu like seedy massage parlors in a Bangkok red light district)

Needless to say, "we've been siphoning your data for multiple decades, so you can trust us to keep doing so even though we're actively betraying that trust at this very moment" is cold comfort at best. I would hope that the New™ and Improved™ Microsoft™ would at least have some self-awareness about that.

I'm sure your intentions are good and pure. I hope they stay that way. I have no way of knowing whether or not they will, and historical precedent says they probably won't. It shouldn't be surprising that I and plenty of other tech-savvy users would take such claims of trustworthiness with a baseball-sized grain of salt.


I should have said - "be good actors in relation to how we handle your data."

According to the Windows GDPR statement (https://docs.microsoft.com/en-us/windows/privacy/gdpr-it-gui...), we only collect Windows functional data and Windows diagnostic data, and only if you allow us.

If we did more than that, we'd be liable for huge fines/jail! We have never "siphoned your data for multiple decades" - we only collect WHEN and ONLY WHEN you let us.

The way you know that our intentions haven't changed is that we are legally required to tell you if they do!


I know a lot of people like to claim that the US is like the EU, but this is the first time I've heard someone claim that the US is literally in the EU.

Less sarcastically: the GDPR is not applicable in the US, so hiding behind that does me no good. If your only defense is "but the GDPR doesn't let us do that!", then that's deeply concerning and only proves my point.


You keep saying trust the GDPR, but what about us on the other side of the pond? GDPR gives us zero recourse if you do run afoul.


@ThIronYuppie Let me say first of all that I really appreciate your sincerity and belief in the Goodness of the company that you work for. The issue is not insincerity, but:

* a lack of trust in corporations in today's economic and political climate and

* the skyrocketing growth of the online advertising business, which has made user data quite literally the gold mines of the digital age

The incentives today are very, very well aligned for a corporation to sell user data. Even if a company may mostly be made of good people, without explicitly written contracts or declarations of any kind, a company can change internal policies rather quickly. Maybe there will be a manager who sees the selling of that data as a way to boost profit and rise within the company. Maybe the company sells that division along with their data.


You say the issue is not insincerity but I beg to differ. Microsoft has never been particularly sincere about the reasons it does the things it does.

It has been a deceitful bully of a company for a long time.

Indeed, one of the main reasons why there is so little trust in corporations is because they are so insincere, anti-consumer and lacking in any kind of social conscience.


I get it - I do. I'm a recent returnee to MSFT, after many years working not at MSFT, and I remember what it was like.

We can't rewind and fix the past, but we absolutely can be better working forward, and we will.


I totally get it. We're a big company, and you're right to question EVERYTHING. All I can say is we'll try and earn your trust every day and laws like GDPR make sure that we're on the hook to do so. Put another way - we CANNOT change this without EXPLICITLY notifying you.


s/gold/oil/


> I totally understand, though about trust. We've been asking users for feedback for 20+ years, I hope we've shown ourselves to be good actors!

I don't believe you have, sorry (Microsoft I mean, not you specifically). I get the impression Microsoft is the same company it always was, only now it's better at convincing the gullible that it is on their side.

Those Windows 10 update prompts on previous Windows versions were obnoxious and downright deceitful, making the 20+ year old traditional 'close window' X in the corner of the window actually download the update instead of doing what users expect it to do.

That's before we even get onto the subject of Windows 10's draconian privacy policies and use of dark patterns all over the OS designed to get people to disable their privacy settings.

Of course, even if they don't fall for tricks like coloring the 'privacy customisation options' link during installation a slightly different colour of blue than the background of the window to make it deliberately hard to read, they will still likely have all their privacy options reset and have previously removed crapware like Candy Crush added all over again in a forced update.

I'm on a Pro version of Windows 10 and I can't even disable the telemetry and constant phoning home every time I launch an application and/or encounter a problem (that I absolutely do not need help with) with something I'm running.

I could understand this crap with the Home version. Not on Pro which I fully expected to have a degree of control over like I did with Win7 pro.

As a result of all this, Microsoft finally made my shit list after decades of shady behaviour and anti-competitive business practices and I won't be buying or using anything by them going forward.

Hell I even started PC gaming on Linux, that's how much I dislike the things Microsoft does.


I'm a little late to this party, but I couldn't agree more (except that I'm in the fortunate position of making decisions about my businesses, so we can decide not to use Windows 10 at all for much the same reasons).

We also don't use VS Code at all, for one simple reason: we tried to determine what the privacy policy actually was, and we couldn't. In particular, there appeared to be wording among the various indirectly linked documents that implied Microsoft might at some point upload our source code without our knowledge or consent, with no guarantees and nothing to indicate how it would or might then be used.

Combine that with an organisation that has a recent deliberate strategy of pushing updates whether wanted or not, collecting data whether volunteered or not, and attempting to coerce or deceive users into accepting those things whether it's in their interest to do so or not, and unfortunately while certain people who work for Microsoft and comment here may have no ill intent, it simply isn't safe to trust the company as a whole with the same benefit of the doubt.


Good news! The privacy policy is specifically listed in our FAQ on our website - https://code.visualstudio.com/docs/supporting/FAQ

We _only_ collect telemetry data. NEVER user data. And you can opt out! (it's right there during set up - or you can go here - https://code.visualstudio.com/docs/supporting/faq#_how-to-di...).

Again, to reiterate, we would never collect data if you didn't consent. It's the law!


> Good news! The privacy policy is specifically listed in our FAQ on our website

Unfortunately, if you go down that rabbit hole (which we did before) it follows various redirect links and ends up at https://privacy.microsoft.com/en-us/privacystatement, which is a generic document that changes frequently and includes by reference ambiguous additional product-specific documents that may or may not exist.

What you need here is an absolutely clear, unambiguous statement to the effect that you will never under any circumstances upload our source code or other proprietary data to any external system without our explicit opt-in. It's really that simple. Otherwise, I'm afraid those of us working under commercial confidentiality agreements or other legal controls are just going to run away.

> Again, to reiterate, we would never collect data if you didn't consent. It's the law!

Perhaps you could explain to me how to turn off the telemetry in Windows 10 then? Or for that matter how consent was obtained for the telemetry that was silently added to earlier Windows versions after release by anyone who installed Microsoft's recommended updates?

I appreciate that your personal intentions may be honest and good here, but the simple fact is that your organisation has a very clear, very bad track record at this in recent years, and its senior leadership has not only been entirely unrepentant about that policy despite widespread criticism but actively doubled down on it. Anything Microsoft does will naturally be contaminated by that history now, unless as a minimum it makes a clear, legally actionable statement and/or imposes verifiable technical measures to guarantee different behaviour.


From the linked FAQ:

> VS Code collects usage data and sends it to Microsoft to help improve our products and services. Read our privacy statement to learn more.

> If you don't wish to send usage data to Microsoft, you can set the telemetry.enableTelemetry setting to false.

> From File > Preferences > Settings (macOS: Code > Preferences > Settings), search for telemetry.enableTelemetry and uncheck the setting. This will silence all telemetry events from VS Code going forward. Telemetry information may have been collected and sent up until the point when you disable the setting.

That's ALL telemetry. So, the second you don't actively opt-in, we collect no telemetry data in VS Code AT ALL.

You'll also want to disable crash reporting:

> VS Code collects data about any crashes that occur and sends it to Microsoft to help improve our products and services. Read our privacy statement to learn more.

> If you don't wish to send crash data to Microsoft, you can set the telemetry.enableCrashReporter setting to false.

It's that simple. Absolutely nothing - and CERTAINLY no user code (which we never collected in the first place).
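If you'd rather script this than click through the settings UI, here is a minimal sketch. It assumes your user settings.json is plain JSON (no comments or trailing commas) and that you pass in the right path for your platform; the helper names are hypothetical, and only the two setting names come from the FAQ quoted above.

```python
import json
from pathlib import Path

# Setting names as quoted from the VS Code FAQ above.
TELEMETRY_KEYS = ("telemetry.enableTelemetry", "telemetry.enableCrashReporter")


def disable_telemetry(settings: dict) -> dict:
    """Return a copy of the settings with both telemetry keys set to False."""
    updated = dict(settings)
    for key in TELEMETRY_KEYS:
        updated[key] = False
    return updated


def patch_settings_file(path: Path) -> None:
    """Rewrite a settings.json in place.

    Assumption: the file is plain JSON. A file containing comments
    will fail to parse here and is left untouched.
    """
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    settings = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(disable_telemetry(settings), indent=4) + "\n",
        encoding="utf-8",
    )
```

Calling `patch_settings_file(Path("path/to/User/settings.json"))` keeps every other setting as it was and only forces those two keys to false.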


Thank you for the further response. I'm not sure whether this is your area at work or you're just trying to help here, but as further feedback in return, that reads like an opt-out scheme to me. I'm also immediately struck that this relies on a setting (which Microsoft products have a history of changing when installing later updates) and that we still haven't resolved whether VS Code has its own separate privacy policy (which is mentioned as a possibility in the generic Microsoft privacy statement we were discussing before, without any specific indication of how to determine the answer definitively or where it would be found if it exists).
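On the worry about a setting being flipped back by a later update: one partial mitigation would be to re-check the file after each update. A rough sketch, assuming the settings live in a plain-JSON settings.json and using the two setting names quoted from the FAQ (the function names are made up for illustration):

```python
import json
from pathlib import Path

# The two settings the FAQ says control telemetry and crash reports.
EXPECTED_OFF = ("telemetry.enableTelemetry", "telemetry.enableCrashReporter")


def telemetry_keys_not_disabled(settings: dict) -> list:
    """Return the telemetry keys that are missing or not explicitly False."""
    return [key for key in EXPECTED_OFF if settings.get(key) is not False]


def check_settings_file(path: Path) -> list:
    """Load a settings.json (assumed plain JSON) and report keys needing attention."""
    if not path.exists():
        # No file means no explicit opt-out has been recorded.
        return list(EXPECTED_OFF)
    return telemetry_keys_not_disabled(json.loads(path.read_text(encoding="utf-8")))
```

An empty result means both keys are still explicitly false. This only tells you what the settings file says, not what the application actually sends, so it doesn't answer the legal question.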

So again, while I appreciate that individuals involved may have honest intentions and be trying to help here, this is still a very long way from the kind of clear, unambiguous official statement that would make me trust any Microsoft product enough to use it in the current data-harvesting, forced-updates climate. I have specific legal obligations to clients when dealing with their source code and the proprietary knowledge implicit within it, and there's no way I can take this sort of documentation to my lawyer and say "Can I use this?".


You can read more about exactly how the IntelliCode feature uses telemetry and data here:

https://docs.microsoft.com/en-us/visualstudio/intellicode/in... - No user-defined code is sent to Microsoft, but we do collect information about your use of the IntelliCode results.

For base model suggestions, which are open source or .NET types and members, we capture whether you selected an IntelliCode suggestion and log the name of the suggestion. Microsoft uses the data to monitor the quality of the base model. For custom models, we capture whether you selected an IntelliCode suggestion but do not log the names of your user-defined types or methods.

Thanks, Mark Wilson-Thomas, Program Manager, Visual Studio IntelliCode


Thank you for the insight. Would you know if this is written somewhere on the VS Code homepage or privacy policy? I think it should be highlighted!



Is there a way we can see what data specifically is being sent to your servers from our local VSCode installations?


(Disclaimer, work at Microsoft. Not on VS Code.) I expect you should be able to run Fiddler or Wireshark or similar traffic sniffers to see the requests.


At least in theory, it shouldn't be necessary to go to such an extreme, considering that VS Code is ostensibly FOSS and thus readily auditable for this sort of thing: https://github.com/Microsoft/vscode

This assumes, of course, that y'all aren't doing any weird code-injecting funny business when packaging it up for installation :)
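As a starting point for that kind of audit, here's a small sketch that walks a local checkout and lists every line mentioning a given string, for example the telemetry.enableTelemetry setting name from the FAQ. Whether that string appears literally in the source is an assumption on my part, and the function name and file suffixes are arbitrary choices.

```python
from pathlib import Path


def find_mentions(root: Path, needle: str, suffixes=(".ts", ".js", ".json")) -> list:
    """Return (relative path, line number, stripped line) for each line containing needle."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits
```

Something like `find_mentions(Path("vscode"), "telemetry.enableTelemetry")` would give you the places to start reading. It only covers the source tree, so the caveat about what happens at packaging time still stands.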


I presume these requests are encrypted? So, it is hard to know what exactly is being sent.



What people want to hear is there is none, and there never will be. Even that isn’t enough, but at least there is some legal recourse.


We need to stop relying on the assertions Big Tech make about themselves, and work towards regulation.

What most viewers saw during the Facebook congressional hearing was woefully out-of-touch (forgive me for using this phrase) “old white guys” questioning Mark, with many facepalm-able moments. But the House hearing, which includes 40 representatives from the Millennial generation, who grew up on tech, asked much better questions. It just got swallowed up by the news cycle because the Manafort raid was happening at the same time as the hearing about the Cambridge Analytica scandal.

All of this is still new. Banking, Finance, Real Estate, those are all well regulated at this point and arguably much more sophisticated or complex. The situation we have now with the Googles, Amazons, Microsofts, Facebooks of the world packaging up our metadata, combining it with other third party data, training AI, etc. is all new, and those who are knowledgeable about it are just now making it into Congress, so I am optimistic we will start to see real regulation on these shameful business practices soon.

Roger McNamee had an excellent conversation on the Sam Harris podcast recently about all of this. It was a very good conversation and one I highly recommend listening to.


Can you say more? Like there is none what?

If there are any changes to this doc - https://docs.microsoft.com/en-us/microsoft-365/compliance/gd... - you would be able to check that immediately, and we would never change our data collection policy without changing that document... It's the law!

Disclosure: I work at Azure on machine learning but not VS Code


"We specifically make the UE4 EULA apply perpetually so that when you obtain a version under a given EULA, you can stay on that version and operate under that EULA forever if you choose." https://twitter.com/TimSweeneyEpic/status/108340919447950541...


He means no telemetry. None whatsoever. At the very least a default opt-out.


You got it! https://code.visualstudio.com/docs/supporting/faq#_how-to-di...

(we ask both during install, and you can disable at any time)


Is there any way we can verify this?


If you can't trust Microsoft employees saying it, then it's most likely you won't be able to trust anything short of running Wireshark to verify the outgoing data, no?


For better or worse, here - https://docs.microsoft.com/en-us/microsoft-365/compliance/gd...

However, if you don't trust us (and I get it!), your best bet is just to opt out, which you can do at any time, including during install.

Disclosure: I work at Azure on machine learning but not VS Code


If you're in the EU, you could ask for all collected data under the GDPR. [ed: all personal data - so there might be some set of aggregated, non-identifiable data that they might not be required by GDPR to allow insight to - that is however a pretty high bar to clear (and intertwines with being allowed to sample / collect that data in the first place, informed consent, defaulting to not collecting data (opt-in), etc.)]


Whew! For a second I thought MS might be selling information on what type of variable names I use to advertisers.


Nah, it's more like automating you away once a significant part of your coding process is replicated by AI in the future, and then selling your replacement/virtual clone to your employer at a lower cost :-P I guess at some point there will be a replacement/clone store, and any company could add 'mhermher'-style coding to their own projects by purchasing a clone there. Clones would be internally ranked, with the price rising exponentially the more capable the replacement is (e.g. ACM ICPC winning replacements will cost millions).


I'm a Windows desktop user, XP, 7, and 10. That's what I want, a DESKTOP. I don't have a smart phone. I may soon buy a $20 mobile flip phone. I have nothing from Apple.

For my own general purpose computing, I want Windows desktop especially with its backwards compatibility.

For business, I'm doing a startup which for the users is a Web site, and I wrote the code on XP and will run it on 7 as my server until I switch over to Windows Server.

I've done nothing with Unix or Linux.

In the past 10 years, I've typed in about 400,000 lines of text as software, with about 100,000 program language statements. For the compiled code, it's essentially all Visual Basic .NET with ASP.NET for Web pages and ADO.NET for getting to SQL Server database. I like VB.NET. I did my startup Web pages with ASP.NET -- seems fine to me although they are my first Web pages and I'm no Web page expert. E.g., I wrote no JavaScript although ASP.NET writes a little for me which is optional.

For a cloud, for now I'd be concerned about cost, startup time, and security. Later I'll be concerned about security.

I tried Visual Studio for a few minutes and gave up on it. I've used no integrated development environment and have no desire to.

My two most important tools are my favorite text editor KEdit and my favorite scripting language Rexx. I type my code into KEdit and have a lot of KEdit macros to help me with the code. The code for my Web site is about 100,000 lines of typing and about 24,000 programming language statements -- I had no trouble debugging and never wanted anything like Visual Studio.

I wouldn't use Visual Studio for free -- too much botheration for too little need.

When I looked at Visual Studio, I could find no reasonable, usable, competent documentation, and no way do I want to take out months to figure out like a puzzle and document Visual Studio for my own use. E.g., Microsoft keeps talking about "IntelliSense" as if I should already know what that meant and would like it. Of course, I can't look up IntelliSense in a dictionary, and absolutely, positively, with feet locked four feet down in reinforced concrete will Microsoft refuse to document, describe, and explain what they mean by IntelliSense. It is as if their gibberish not in any dictionary has self-evident meaning and value -- it has neither. Grotesque, outrageous, inarticulate, incompetent, sick-o communications.

It turns out, ASP.NET is super easy to debug -- just give the .ASPX file to a Web browser and let ASP code do its things.

For Office and e-mail, I use my legal copy of Office 2003 with Outlook. Now that I have good, extensive notes on how to adjust the settings and options on Outlook, I can do the setups in an hour instead of the several days as before; it's fine. There are some improvements I could think of, but I doubt that Microsoft would be interested. It might be that I could program the improvements with the old VBA which might be able to read and parse an Outlook PST file, but so far I've never tried VBA, and when I did want to try it, I didn't have a copy. For Excel, I use it to draw simple graphs and otherwise regard it as worthless -- I'd much rather write code in Rexx, VB.NET, Fortran, PL/I, etc. and then use Excel to draw the plots. My understanding is that now there is much better graph drawing software, likely with API's, or better API's, than Excel.

For high quality word whacking, I use D. Knuth's original TeX and love it. I hate Word -- used it some, got okay with it, but hate it.

Lessons: To me, for my personal computing and for business, I want the full power of a good desktop computer. I place high value on backward compatibility, e.g., back to Office 2003, an old Watcom Fortran, an old IBM OSL (Optimization Subroutine Library) to be called from Watcom Fortran, KEdit, Open Object Rexx, etc. For business and software development, VB.NET is fine.

Far and away my greatest gripe with computing, the computer industry, and Microsoft is documentation -- on average, the quality of the documentation is awful. To me, the biggest problem in my startup, by far, is poor documentation. Broadly the poor documentation has commonly taken me 100-200 hours to do things I should have been able to do in one hour. So, I DO write my own notes and then AM able to do the stuff in one hour. The bad documentation is close to killing my startup. Long since I should have been sending six figures a year to Microsoft for licenses on their software, and the main reason I'm not is their documentation. E.g., it took me two weeks of full time mud wrestling JUST to find a connection string that worked with SQL Server -- should have taken 10 minutes. Recently I spent 80 hours full time getting simple file sharing, as a first time user, between my 7 and 10 systems. I wrote up notes for myself that will solve the problem for any first time user in less than an hour. I posted the notes on TechNet. Responses from others were awful, e.g., kept talking about Workgroups and Homegroups and some Windows password tool, ALL of which are just irrelevant down to next to useless. Apparently nearly no one still actually knows how to use the simple, well designed, command line NET commands to set up first time user file sharing. The old NET documentation is also awful, gets a grade of flat F in just simple Backus-Naur syntax notation 101 and totally omits anything about semantics, meaning, usage, understanding, security, consequences, timeouts, etc. I had so much trouble with even simple things with SQL Server that eventually I got help, really simple answers, from some high up SQL Server executive.

To me, the most serious problem blocking information technology, computing, Microsoft, and my work is BAD DOCUMENTATION.

When Microsoft learns how to describe their work, then I'll start to consider if they are a competent, functioning company with a bright future.


Fuzzy logic, Hamming scoring, and Markov chains suddenly became "AI" now...

They don't need your telemetry for that


I'm pretty sure these approaches were considered AI from the very start. If anything, algorithms stop being referred to as intelligent over time, as our expectations of what is possible grow.


Yawn. "AI" has always been, and probably always will be, a marketing term to sell other algorithms.


> "Contextual recommendations are based on practices developed in thousands of high quality, open-source projects on GitHub each with high star ratings."

Wait, what? Is there a way to opt out of that in GitHub? I don't want their AI to scan my code.

Realistically, since most of my GH code is Common Lisp with few stars, I doubt it's looking anyway, but I'd like to make sure.

I'm sure it's covered in their TOS, but as a (still) paying GitHub customer I don't want them to do that with my repos.


Should have nothing to do with github and everything to do with you putting your code on the internet with a permissive license. Google or Eclipse could do this too (and I expect they do, since it's the best corpus of code available).


Google does, in a way, with the GitHub data set in BigQuery -> https://cloud.google.com/bigquery/public-data/#sample_tables


Don't make your repos public if you don't want people reading them


I disagree with the paren't sentiment but I think they raise an interesting point: is/should there be a license that says: you can read/use/modify {this} but you cannot use it as a training data for the AI.

{this} can be code or text or an image or any other content.


Genuine curiosity... why have you put an apostrophe in "parent"?


attempting to write "parent's" before coffee


Just change the license file until GitHub doesn't recognise the license any more.


Is it publicly hosted on Github with a permissive license?



