Did Claude increase bugs in rsync?

GodelNumbering · 2026-06-05T22:32:08 1780698728

Was just looking at commits and came across a commit and its revert

original commit: https://github.com/RsyncProject/rsync/commit/d046525de39315d...

```
-    if (!ptr)
-        ptr = malloc(num * size);
-    else if (ptr == do_calloc)
+    if (!ptr || ptr == do_calloc)
         ptr = calloc(num, size);
```

Written with claude. This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade. For large and recursive allocations, this becomes a significant cost.
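To make the diff concrete, here is a minimal sketch of the two behaviours in C. It is not rsync's actual code: the names `alloc_old`, `alloc_new`, and `DO_CALLOC` are hypothetical stand-ins, the overflow check is an added assumption, and the `realloc` fallback is also an assumption because the diff does not show that branch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "give me zeroed memory"; it stands in
 * for the do_calloc marker seen in the diff. */
#define DO_CALLOC ((void *)(uintptr_t)-1)

/* Returns nonzero if num * size would overflow size_t (added assumption). */
static int mul_overflows(size_t num, size_t size)
{
    return size != 0 && num > SIZE_MAX / size;
}

/* Before the change: a NULL ptr means plain malloc, so the contents are
 * indeterminate; zeroing happens only when the caller asks for it. */
void *alloc_old(void *ptr, size_t num, size_t size)
{
    if (mul_overflows(num, size))
        return NULL;
    if (!ptr)
        return malloc(num * size);
    if (ptr == DO_CALLOC)
        return calloc(num, size);
    return realloc(ptr, num * size); /* assumed fallback, not in the diff */
}

/* After the change: every fresh allocation goes through calloc. */
void *alloc_new(void *ptr, size_t num, size_t size)
{
    if (mul_overflows(num, size))
        return NULL;
    if (!ptr || ptr == DO_CALLOC)
        return calloc(num, size);
    return realloc(ptr, num * size); /* assumed fallback, not in the diff */
}
```

Whether the extra zeroing actually shows up as resident memory depends on the allocator and the OS, so the sketch only shows the change in control flow, not its cost.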

reverted in https://github.com/RsyncProject/rsync/commit/7db73ad9a1b8721...

If you read the description of the revert even half carefully, it's easy to tell that it too was written by an LLM.

I can understand the sentiment of whoever posted the original thread.

wolletd · 2026-06-05T23:03:18 1780700598

Also the number of commits is suspicious. In the last two months, rsync had about as many commits as in the two years before that. Most of them written with Claude. And then stuff like this is in there.

That's exactly what I'd expect when someone is excited about AI usage and becomes... well, sloppy.

logicprog · 2026-06-05T23:20:21 1780701621

Tridge already explains this:

"Like many developers of open source packages I’ve been hit by a flood of security reports lately in my role as the rsync maintainer. Many of those reports are AI generated (not all though, there are some notable ones with very careful and high quality manual analysis).

As this flood started to get more intense I realised I needed to raise the defences on rsync a lot — we needed much more thorough test suites, code coverage analysis, CI testing on a lot more platforms, deliberate and thorough scanning for possible security issues (so I find at least some of them before other people!) and the addition of a whole lot of defence-in-depth hardening techniques. This is all a huge amount of work. "

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

tom_ · 2026-06-05T23:26:59 1780702019

AI multiplied by Linux overcommit. What times we live in!

(My own view: 10.8 GB is nothing these days. Your sprintf buffers are probably larger than that. (And if they aren't: they should be. That, or you should start using snprintf...))

scottlamb · 2026-06-05T23:05:29 1780700729

> This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade.

I wouldn't assume Claude made that decision; it's not as if that was some incidental thing that it snuck into a large commit. The commit message starts with "zero all new memory from allocations", and that's exactly what the commit does. What do you imagine the prompt was?

It seems totally plausible to me that a human initially thought this was an improvement, then rethought after discovering the RSS regression. And it's not a law of nature anyway that this change has to increase RSS; calloc could special-case the case in which memory was freshly returned from the OS, knowing fresh memory mappings are zeroed anyway.

I blame AI for these regressions mostly in the sense that it caused a flurry of vulnerability reports. Those led to a flurry of quick fixes. Sometimes quick fixes cause other problems.

delusional · 2026-06-05T23:08:45 1780700925

You don't really have to guess. The guy told us the AI didn't suggest this specific change:

> The change to zero memory was my idea and my change. It was a reaction to a security report I got which caused use of an element past the end of an array. By zeroing the allocation I could ensure that misuse of that memory if a similar bug came up in the future could only cause a null ptr deref, which is better than the chance of a valid pointer. It got a claude co-authored tag on it as I got it to do some tidy ups of a series of commits, and that is just what it does when it makes any modification. It doesn't mean the change was written by claude. It was written by me.

https://github.com/RsyncProject/rsync/issues/959#issuecommen...
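A minimal sketch of the reasoning in that quote, under one loud assumption: the array in question holds pointers and is allocated with more capacity than is currently in use. The `ptr_list` type and its functions are hypothetical, not rsync code. With a zeroed allocation, a buggy index that runs past the used entries but stays inside the allocation reads NULL rather than a stale value. An index outside the allocation is still undefined behaviour, which zeroing cannot fix.

```c
#include <stdlib.h>

/* Hypothetical growable list of string pointers (not rsync code). */
struct ptr_list {
    char **items;     /* `capacity` slots, all zeroed by calloc */
    size_t used;      /* slots that hold valid entries */
    size_t capacity;  /* slots actually allocated */
};

/* Allocate a list whose unused slots start out as all-bits-zero, which
 * is a null pointer on mainstream platforms. */
struct ptr_list *list_new(size_t capacity)
{
    struct ptr_list *l = calloc(1, sizeof *l);
    if (!l)
        return NULL;
    l->items = calloc(capacity, sizeof *l->items);
    if (!l->items) {
        free(l);
        return NULL;
    }
    l->capacity = capacity;
    return l;
}

/* Simulates the buggy access pattern: it checks capacity but not `used`,
 * so it can hand back a slot that was never filled in. */
char *list_get_buggy(const struct ptr_list *l, size_t i)
{
    return i < l->capacity ? l->items[i] : NULL;
}

void list_free(struct ptr_list *l)
{
    if (l) {
        free(l->items);
        free(l);
    }
}
```

With `malloc` in place of the second `calloc`, the never-filled slot could hold any leftover value, including something that looks like a valid pointer.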

jagged-chisel · 2026-06-05T23:26:03 1780701963

> … By zeroing the allocation …

How does that prevent reading past the end of the buffer? Or change how bytes outside the buffer are used? Are these arrays of pointers so that the “null ptr deref” comment makes sense?

Or am I the bozo and don’t know what’s happening here?

GodelNumbering · 2026-06-05T23:25:32 1780701932

Okay, I had not read this or any of the discussions there (except the one linked in the post), but this looks weirder. The comment you linked is a dev responding to what is very clearly a bot comment. I am sure they have good intentions, and I have no reason to believe otherwise as I have no connection to the project whatsoever. But the original commit being only 4-5 lines long (what did Claude do then?), together with a revert description that was almost certainly written by an LLM, makes the slop argument stronger in my mind.

I hope this doesn't come across as unkind towards the dev, who gives their time and energy to the project. Grateful for that.

jarym · 2026-06-05T23:17:52 1780701472

I've been coding for over 2 decades. I love it, I've always loved it and I likely always will.

I was an AI skeptic some months ago but truly Claude and Codex have changed my development style and velocity in a way I never imagined would ever be possible. With that, yes, I produce more code and am finding more bugs.

So looking over the comments on HN articles, the amount of polarising hate toward anything produced with AI is quite surprising. Just because some AI helped with a project, or even produced it entirely, doesn't suddenly make it 'vibe coded', as if that's meant to be some insult levelled at users of LLMs.

It reminds me a lot of when offshore outsourcers started getting more software development work from the mid-90s, with all the derogatory remarks made towards 'Indian developers'. Now we're in the mid 2020s and similar remarks are made towards AI.

I don't get it. I really don't. What I do know for sure is more and more code will be AI generated with or without the detractors.

nomel · 2026-06-05T23:28:56 1780702136

I've always noticed, within any subject involving tools, there are people who like the tools, and some people who like to use the tools to do something else.

With programming, I've always been in the latter: it's a tool that allows me to do what I actually love, which is problem solving, system level thinking, and providing some nice solution to that problem, that happens to be through software.

So, I have an absolute blast with AI, because it helps do the more boring bits. And, seeing my non-programming colleagues get excited to see their vibe coded ideas become reality has been so much fun.

I'm genuinely curious to hear the perspective of someone anti-AI, who works in software. Perhaps the impending doom/skill shift of our profession?

Joel_Mckay · 2026-06-05T23:34:05 1780702445

LLMs are good for context search, and template output.

However, you also get the lowest common salient answer guaranteed, uncopyrightable work (differs from public domain), and potential legal peril from copyright bleed-through.

We are in the golden Napster age of isomorphic plagiarism. =3

RustyRussell · 2026-06-05T23:21:27 1780701687

For those commenting, I suggest you read the post linked by the rsync author:

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

(Disclosure: while I haven't talked with him in years, Tridge was my colleague and mentor for many years. I feel it is worth considering his view before joining a crusade)

gravypod · 2026-06-05T23:37:33 1780702653

This is a really cool post but I think one metric we may want to also look at is does using agentic coding tools in one domain impact your coding abilities in another domain? A lot of people I know have been talking about getting rusty on the fundamentals recently. This is not something I am particularly feeling as I do a mix of running agents in parallel and writing some code manually where it makes sense. But if people who have been prompt-only at work come home and work on rsync and are more "rusty" maybe that could also lead to more bugs?

This would be even harder to measure.

aesthesia · 2026-06-05T18:41:54 1780684914

I don't have a dog in this fight, but a few points that look a little suspicious:

- The release with the highest number of attributed bugs is the release _right before_ the first release with Claude-coauthored commits, released in January; is there a chance that unattributed LLM-authored commits made it into this release?

- The release attribution methodology is not great, since it will tend to attribute bugs introduced in a minor version update to the longest-lived patch release of that minor version. I doubt that 3.4.1 actually introduced a lot of bugs, but since it was released a day after 3.4.0, bugs that were introduced in that release get attributed to 3.4.1.

- Relatedly, more recent releases have had less time to have bugs filed against them, so there may be a bit of a bias toward evaluating recent releases as less buggy.

theteapot · 2026-06-05T23:15:26 1780701326

Agree. From the article:

> Here's my favorite part, though. Digging into the data, one of the first things that jumped out at me with blinding clarity was that the worst release, by far, in rsync history was entirely prior to the introduction of Claude ... And yet nobody noticed.

Language really does suggest the article's author does have a dog in this fight and is cloaking opinion in fancy statistics jargon. "Blinding clarity"? All you have to do is draw a plot. And anyway, v3.4.1 was 2025-01-16, technically well within the AI assisted coding era and before attribution was becoming standard practice.

OptionOfT · 2026-06-05T18:59:23 1780685963

You can use LLMs in multiple ways, from very hands-on, making local changes, to completely hands-off.

I've seen plenty of code that was LLM generated but the commit message itself did not have the co-author attached to it. This only seems to happen when someone's interface to the codebase is completely through Claude/Codex/..., and those are usually the most verbose commits, and yet they say the least, because they just summarize the code changes, not the why.

On the other hand I've seen developers using Claude as a tool. They have VSCode open and a terminal window with Claude and go back and forth, ensuring they write correct code, and leave the plumbing to Claude.

So maybe the author of the code started off small and it grew over time?

hparadiz · 2026-06-05T22:26:46 1780698406

I would expect a mature code base like rsync to have a lot of unit tests and integration tests. Frankly, if there aren't enough of them to catch bugs like these, writing them should be your first use of LLMs, so that you have some deterministic guidelines in place when you do start making changes to your actual code.

I have been experimenting with both aforementioned styles with interesting results.

cyanydeez · 2026-06-05T22:42:13 1780699333

I've had a local LLM spend weeks trying to write tests, then debug those tests, then write antipatterns and patterns for those tests.

It's amusing. It's not terrible, but tests aren't going to save you from a malicious tester.

logicprog · 2026-06-05T18:51:19 1780685479

Your first and second points seem to contradict each other: if all of the bugs for 3.4.1 should be attributed to 3.4.0, that pushes back even further the point at which unattributed LLM commits would have had to start landing in the project, which just makes your point even more absurd.

Which brings me to my overall response, which is that there is absolutely no evidence for, and nothing even intimating, the hypothesis that LLM commits were secretly being added to earlier releases before they were attributed, and that that's why the rate of bugs is higher. There's no reason to think that, and there's no evidence for it whatsoever unless you beg the question and assume that higher bug counts must automatically indicate AI involvement, which is just circular reasoning. You're essentially just making up a hypothesis out of thin air to preserve your point.

Regarding your third point, that one's fair, but I've done the analysis and I can put it up if you want, as to how long it usually takes to find bugs and how far through the release cycle we are for each version.

aesthesia · 2026-06-05T19:10:41 1780686641

Sorry, I should have said this explicitly in the original comment: I think you're likely _correct_ that there isn't a clear increase in the rate of bugs attributable to LLM-authored code in rsync. Your analysis provides evidence in this direction; these are just the things that made me go "hmm". They're not accusations or claims that the conclusion is invalid. But they're definitely things to be curious about.

Regarding unlabeled LLM-authored commits, I don't think it's unreasonable in general to think that an open-source project might have had unlabeled LLM-authored commits at some point before 2026. Looking more closely at rsync's recent commit history, I think it's less likely in this case. There's just a low number of commits in general, _until_ large batches of Claude-authored commits start showing up early this year. But this then raises some questions about the bugs-per-commit metric; it does correct for something like "size of release", but also obscures a significant shift in commit velocity that may be downstream of adding LLM development tools to the workflow.

Like I said, I don't have a dog in this fight, and I try not to approach these sorts of questions from a position of explicit advocacy. I do think it's an interesting question, though, and we should try to understand what the data is actually telling us.

jonquark · 2026-06-05T19:08:31 1780686511

Isn't the metric that you've used "bugs per commit ~ per new line of code" going to miss the issue?

All code is technical debt.

If rsync releases used to have 500 lines changed and 5 bugs in and AI-powered rsync releases have 50000 lines and 500 bugs, it's the same bugs/line but much worse experience for the user?
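Spelling out the arithmetic in that hypothetical (the numbers are the comment's own illustration, not measurements):

$$
\frac{5\ \text{bugs}}{500\ \text{lines}} = \frac{500\ \text{bugs}}{50000\ \text{lines}} = 0.01\ \text{bugs per line},
\qquad
\frac{500\ \text{bugs}}{5\ \text{bugs}} = 100 .
$$

The per-line rate is identical, while the number of bugs a user can run into is 100 times larger.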

I've not looked into the details of this case and I do use AI assistance coding at work but in my experience, the problem is that it's too easy to write lots of code and therefore hard to review the huge volumes of code and this analysis will ignore that?

edit: actually your table shows there weren't unusually large numbers of commits in this release, so perhaps my initial skepticism shows a bias I have?

PunchyHamster · 2026-06-05T20:18:32 1780690712

Let's start with the most outright alarming error: the Claude statistics are drawn from a grand total of 2 data points.

logicprog · 2026-06-05T20:20:11 1780690811

That's sort of the point. There isn't enough data to extrapolate, and yet that's exactly what those outraged about AI were doing, and when you do do the very minimal types of analyses (permutation tests, and looking at distributions, mostly) that are actually valid, safe, standard, and useful to do on such low amounts of data, again, no evidence for the outrage shows up, and the two releases look so normal that it sort of shows no one would've cared if they hadn't known or found out that Claude was involved.

I really think this is a much better standard of evidence, limited though it is, than outrage-fueled cherry-picked anecdotes, which is what has been driving this whole thing. If you disagree, and think the outrage should go on when I've shown there's an absence of evidence entirely for it (although of course, that's not evidence of absence; maybe I'll have to eat my words 5 releases down the line, but appealing to that now feels like a Russell's Teapot), would you care to explain why?

ofjcihen · 2026-06-05T20:38:26 1780691906

I know you’re defending your work here but this behavior does absolutely nothing to help your point.

logicprog · 2026-06-05T20:48:55 1780692535

Fair point. Let me edit (if I still can) to tone it down.

runarberg · 2026-06-05T20:24:37 1780691077

The interpretation of the p-value is also alarming. One of the first things they teach you in statistics class is: “an absence of evidence is not evidence of absence”.

This analysis showed that there is indeed an absence of evidence, but it concludes there is evidence of absence.

Traditional p-hacking is done by oversampling and overtesting. If you do 20 analyses, on average one will show p < 0.05 by random chance. This analysis is doing the inverse of that: under-sampling, and concluding with p > 0.05.
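As a worked version of the "20 analyses" figure, assuming the tests are independent, every null hypothesis is true, and the threshold is $\alpha = 0.05$:

$$
\mathbb{E}[\text{false positives}] = 20 \times 0.05 = 1,
\qquad
P(\text{at least one } p < 0.05) = 1 - 0.95^{20} \approx 0.64 .
$$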

logicprog · 2026-06-05T20:48:42 1780692522

> This analysis showed that there is indeed an absence of evidence, but it concludes there is evidence of absence.

I tried pretty hard to avoid saying that, can you point me at how to rephrase? The point I'm trying to make is just that there is absolutely no evidence at all for what people are saying with such absolutism and claimed objectivity (that Claude made rsync worse), and thus it doesn't justify the outrage.

> Under-sampling, and concluding with p > 0.05

How would I avoid under-sampling here? And if you're going to say it's because I only have 2 data points, well, the side making the positive claim — that Claude made rsync worse — only had two as well, and unremarkable ones at that, as I've tried very hard to show.

runarberg · 2026-06-05T20:58:22 1780693102

You are interpreting the p-values on their own merit rather than using them to test a null hypothesis. Quotes like:

> With a p-value of 74%, the answer is a decisive no. The odds ratio is 1.06 — essentially 1:1. Claude releases are no more likely to be above the median than any other releases.

are problematic in this context, as the correct conclusion here is that you just don’t have enough data to conclude whether or not you are more likely to encounter a bug after a Claude commit.

> How would I avoid under-sampling here?

You don’t. You admit that you don’t have enough data and move on. What you are trying to do here is prove a negative, which is extremely hard to do. In your discussion you claim that the users complaining had no right to; however, nothing in your analysis showed they were wrong. We simply don’t have enough data (yet) to say either way. When we have enough data they may be proven right or wrong, but until then, we cannot conclude either way.

If you insist still, I recommend looking into Bayesian analysis. Theoretically, at least, the posterior distribution from a Bayesian analysis can be interpreted directly and analyzed on its own merits. However, I suspect your posterior will have way too much uncertainty to reach any conclusions.

logicprog · 2026-06-05T21:37:42 1780695462

Edited that claim, and made several clarifications elsewhere. The whole point of this analysis is that outrage is unjustified on the basis of two totally statistically unremarkable releases that no one would have remarked on pre-AI (my further proof of this is that there was a pre-AI remarkably broken release, and no one did comment!) and zero positive evidence outside cherry-picked anecdotes for any negative impact. We should wait for outrage and version pinning and cancelation until there is evidence, no? I'm just trying to say that these specific releases are unremarkable, and there's no evidence at all of harm currently; I'm not trying to build any kind of predictive model for future Claude releases to say anything grander than "these specific releases are fine, what are we freaking out about?", not some claim about what Claude-exposed releases will look like or trend like in the future or in general.

xmddmx · 2026-06-05T21:28:07 1780694887

The concept you need here is "Statistical Power".

The ELI5 version is that there are two mistakes you can make when looking at a P value:

Type I error, where your P value is falsely low. In the experiment being discussed here, it would lead one to conclude that AI code is worse. Otherwise known as a false positive.

Type II error, where your P value is falsely high, leading you to conclude that AI code is no different. Otherwise known as a false negative.

https://en.wikipedia.org/wiki/Power_(statistics)

One can calculate statistical power for a given experimental protocol.

My hunch is that if you did this, you would find this experiment is grossly under-powered.

This means you can't treat the "absence of evidence" as evidence of absence.
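One rough way to see how little a two-release sample can say is to look at an exact one-sided permutation test (the kind of test the author mentions using elsewhere in the thread). Assuming no ties, with `k` treated releases out of `n` there are only C(n, k) distinct labelings, so the smallest p-value the test can ever produce is 1 / C(n, k). The sketch below computes that floor. The function names and the example values of `n` are hypothetical, and this is a bound on the attainable p-value, not a full power calculation.

```c
/* Binomial coefficient C(n, k), computed incrementally so that every
 * intermediate division is exact. Returns 0 when k > n. Overflow is not
 * checked; this is intended for small n only. */
unsigned long long n_choose_k(unsigned n, unsigned k)
{
    unsigned long long result = 1;
    if (k > n)
        return 0;
    if (k > n - k)
        k = n - k; /* C(n, k) == C(n, n - k); use the smaller one */
    for (unsigned i = 1; i <= k; i++)
        result = result * (n - k + i) / i;
    return result;
}

/* Smallest one-sided p-value an exact permutation test can report with
 * k treated units out of n, assuming no tied test statistics.
 * Returns 1.0 for degenerate inputs (k > n). */
double min_permutation_p(unsigned n, unsigned k)
{
    unsigned long long labelings = n_choose_k(n, k);
    return labelings ? 1.0 / (double)labelings : 1.0;
}
```

For example, with `k = 2` the floor is 1/15 (about 0.067) when `n = 6` and 1/21 (about 0.048) when `n = 7`, so with fewer than seven releases in total such a test could never reach p < 0.05, no matter how bad the two releases were.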

cobertos · 2026-06-05T23:32:50 1780702370

This post just gives me more questions than answers and I'm unable to form a decision:

* Why was v3.4.1 the most buggy, right before the Claude commits? Why did "nobody notice"? It's way too strange to just say welp, it must be human error.
* Why does v3.4.2 have 0 bugs, or a 0 bug score? And why was such an outlier (no other commit seemingly has this??) allowed to mix into the aggregate statistics and bring all the "is Claude buggy?" scores down? Tbh idk how that _wasn't_ a red flag in the author's analysis...

This article feels like half of an analysis presented as a highly complex finished product due to all the advanced stats they're running.

thorum · 2026-06-05T18:29:18 1780684158

Unfortunately for the people mad about this, I predict the only thing they will accomplish by pressuring the rsync maintainers is to discourage everyone else from responsibly disclosing their use of AI. You’re just going to make people disable Claude attribution on their commits to avoid drama.

zzyzxd · 2026-06-05T19:13:37 1780686817

I never care about AI usage disclosure, because I don't believe that human produced code is necessarily better than AI produced code, unless it's someone I personally know.

People need to be responsible for code they commit and push anyways. This has never changed. Whether the code is written by hand, by their cat walking over the keyboard, or by AI, is not my concern.

A project's code quality can decline for all kinds of reasons. I don't think it's productive to laser-focus on whether it's produced by AI or not. That's a distraction. If a person just wants to find an excuse to criticize AI, and another person wants to fight back and defend AI, sure, go for it. But that's not how you would want to assess a project's code quality.

delusional · 2026-06-05T23:14:27 1780701267

> People need to be responsible for code they commit and push anyways.

Well the GPL (which rsync is licensed under) says: "This program comes with ABSOLUTELY NO WARRANTY" so actually nobody is responsible for anything.

calvinmorrison · 2026-06-05T19:56:31 1780689391

Something as simple as requiring sign-offs like the DCO may be relevant to people who care. I do think the drive-by stuff may get smaller. People don't need to get stuff upstream. I have lots of patches I am keeping downstream, and instead I have a trigger system: when new package updates drop into Debian, I rebuild the package with my patches on top using quill. Other systems like Gentoo basically always supported this flow.

So - why bother forking or going upstream? Maybe it's selfish. I think publishing the patches is cool, but I feel less of a need to force other people into doing what I want, or even to write every possible configuration or solution. I just hack it for me.

matheusmoreira · 2026-06-05T18:51:49 1780685509

> You’re just going to make people disable Claude attribution on their commits to avoid drama.

People should be doing this regardless of drama. No reason to provide free advertising for trillion dollar corporations. Generated-by trailers are only relevant when contributing to third party projects, in that case disclosure is polite.

Aurornis · 2026-06-05T19:33:43 1780688023

The value of the Claude attribution is that you can tell at a glance who used AI.

I don't care about the advertising angle. We all know Claude by now. I want some indicator that AI was used.

block_dagger · 2026-06-05T21:04:29 1780693469

At my employer, if AI is not used, it shows up on your performance report and you’ll be told if you don’t start using it, you will be dismissed. I work at a medium sized successful YC-backed SaaS. So here, the attribution is meaningless - they look at your Bedrock and LLM API calls as well as Claude Code history.

Aurornis · 2026-06-05T22:55:20 1780700120

If the company policy is to have everyone using it then everyone is going to assume you're using it.

I don't see a need for an attribution line in this case.

fragmede · 2026-06-05T22:53:12 1780699992

Do you fellow ICs have access to those reports and can correlate commits from you to the prompts used to create them easily?

matheusmoreira · 2026-06-05T19:42:30 1780688550

And why do you want to know that? So you can call our projects slop? Ostracize us?

Hammershaft · 2026-06-05T19:55:12 1780689312

Because LLMs are not humans, and the code they produce will have a different distribution of failure modes than human written code, so attribution is useful info while reviewing?

matheusmoreira · 2026-06-05T20:11:06 1780690266

> while reviewing

As I said, disclosure is polite when contributing code to third party projects which will undergo human review.

No need for such things in one's own projects.

Groxx · 2026-06-05T20:21:08 1780690868

> which will undergo human review

This can be largely assumed to be true for any open source code. It's kinda the point of open source.

matheusmoreira · 2026-06-05T20:31:13 1780691473

Nope. It cannot be assumed at all. Maintainer could just as easily tell Claude to review the hand written code you sent instead of spending any effort on it. Maintainer could sit on the patch for months on end only to swoop in later and rewrite it instead of engaging with you, thereby erasing your contribution and attribution. Maintainer could just ignore you entirely despite the pervasive "patches welcome" attitude.

If there's one thing I learned not to do in open source, it's to assume nonsense like that.

Groxx · 2026-06-05T20:39:57 1780691997

I'm referring to the fact that "open source" quite literally means "readable by humans [and machines]", and anything beyond that is a subject of debate. There are more users than readers in nearly all cases, but being able to read the code as a user is a significant benefit at times, and it's one of the reasons it's such a large ecosystem in terms of both users and contributors. (it usually being free is another big reason, of course)

Even with coding agents gaining popularity, many humans still look at the code at some point.

matheusmoreira · 2026-06-05T20:51:50 1780692710

I see. That depends on how much I care about the project. My favorite ones get weeks of review and refinement, to the point I still consider them to be more or less hand written. Not all projects get to be that important.

toofy · 2026-06-05T22:46:46 1780699606

for the same reason we want to know who wrote an article, a book, a movie, a song, a play, a journal paper, a painting, and on and on.

why do so many people want to hide who the real author is?

we should be very wary of anyone claiming they’re the author of something when they’re absolutely not. if jon wrote a book and i take credit, that’s shady as hell.

matheusmoreira · 2026-06-05T23:11:02 1780701062

Ghostwriting is a thing.

ezst · 2026-06-05T20:34:43 1780691683

Some people prefer organically grown food for all kinds of reasons; does it matter to you that they would want the same for code? (Also, I'm not picking a side here)

matheusmoreira · 2026-06-05T20:37:29 1780691849

It matters when I'm contributing to their projects. In that case I'll go out of my way to be polite and learn their rules.

Aurornis · 2026-06-05T19:52:57 1780689177

You don't need an AI attribution tag to recognize slop. In my experience reviewing PRs, the slop-pushers are most aggressive about stripping the AI attribution anyway. It's the normal devs who use a little bit of AI who leave it in.

The tag is helpful because AI authorship is different from human authorship. When you work with a project or team for long enough you start to trust certain people and their intuition, but when they start submitting AI-produced code you have to reset and review it like AI code.

I use these tools a lot, too. But I want to know where the code came from so I can review it accordingly. The source matters.

> Ostracize us?

I don't know why you're so defensive. If AI wrote the code just be honest about it.

If you outsourced the code writing to some guy named Bob on Fiverr, I'd want to know that too.

matheusmoreira · 2026-06-05T19:57:57 1780689477

> I don't know why you're so defensive.

Check it out:

https://lobste.rs/s/29pm2f/llm_generated_submissions_should_...

https://lobste.rs/s/ytim7h/collection_small_low_stakes_low_e...

Aurornis · 2026-06-05T20:10:37 1780690237

I'm not interested in joining some argument you're having with someone on lobste.rs

matheusmoreira · 2026-06-05T20:13:12 1780690392

You're not supposed to join. You said you didn't know why I was defensive. I showed you those posts as evidence of the stigma attached to LLMs and their usage. Now you know why.

eschaton · 2026-06-05T21:08:37 1780693717

It doesn’t help your case that your response is to say “well I’ll just hide my use.” That’s fraud.

matheusmoreira · 2026-06-05T21:22:41 1780694561

eschaton · 2026-06-05T21:59:09 1780696749

codygman · 2026-06-05T19:45:57 1780688757

So that the AI model that generated the code can get proper credit and we'll know to use it (or not use it) next time.

matheusmoreira · 2026-06-05T19:53:34 1780689214

That's not at all what someone who wants to "tell at a glance who used AI" actually wants to know.

eschaton · 2026-06-05T21:06:24 1780693584

So we can know which commits will be infringing others’ copyright.

julianeon · 2026-06-05T18:58:14 1780685894

If Claude is actually good enough to commit to rsync, of course I'm going to look at that and think "it's good enough for my side project too." And (benefit to companies aside) that is info it is useful to know, if it's true.

amiga386 · 2026-06-05T19:05:25 1780686325

Yeah, this is why it's obnoxious and this is why scummy marketers do it. If you don't aggressively turn it off, they leech an implicit endorsement out of you.

- Sent from my iPhone

AnotherGoodName · 2026-06-05T19:31:21 1780687881

Alto hug the iphone sigoff is hilaripus sonce fhe meyboard is so bad it always comes across asa an ask doe forgivebeds

— Sent from my iPhone

AlienRobot · 2026-06-05T19:22:07 1780687327

Indeed. The best endorsement is done explicitly by obnoxious users.

I use Linux, btw.

trwired · 2026-06-05T18:49:58 1780685398

Is that a bad thing? I mean from the perspective of Anthropic's marketing department sure, but if agents are just another type of tool in a developer's tool belt - as I see people recently like to claim - attribution feels kinda weird. In the end it is the developer who is responsible for their commits.

eli · 2026-06-05T18:54:00 1780685640

Yeah I think it's a bad thing. It's context about how open source code was written that is lost.

And I guess maybe there's no such thing as bad press, but at least in this case it doesn't seem like effective marketing for Anthropic.

eschaton · 2026-06-05T21:05:21 1780693521

“Don’t get mad at people for doing something unethical or immoral, or they’ll do something unethical or immoral!”

Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code.

Of course that fits right in with the use of an LLM to generate code in the first place, since what it’s actually doing is regurgitating its inputs stripped of any license and copyright notice.

jhack · 2026-06-05T22:41:13 1780699273

"Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code."

Should there be attribution for Google or Stack Overflow copy/paste? Who should we bully about this?

eschaton · 2026-06-05T22:46:48 1780699608

Yes, in fact, this is why people who do that are looked down upon.

They are in fact committing fraud if they do not attribute the code in their commit properly, because by committing it they’re claiming to have rights by virtue of authorship that they do not have. (Namely, the right to contribute that code to the project.) They may also be committing copyright infringement, depending on the copyright and license status of some code they found via Google or Stack Overflow.

It’s always fascinating to me to see how many people on Hacker News have such extremely poor understanding of how intellectual property actually works, and how misrepresenting themselves or their work can actually have consequences.

umanwizard · 2026-06-05T23:17:06 1780701426

> Should there be attribution for Google or Stack Overflow copy/paste?

Obviously, and I'm a bit taken aback that anyone thinks otherwise.

UebVar · 2026-06-05T21:26:21 1780694781

I'm very certain that this is not fraud, across multiple legal systems, both Roman and common law. In both cases fraud requires that a person be deprived of a material good. Neither the defrauded person nor their material loss is present in this case. Maybe there is an oddball legal system somewhere in the world where fraud is something entirely different, but I doubt it. "Fraud", just like "Decorator Pattern", is a well-established and pretty simple concept, even if there are edge cases. This does not fit at all.

In academia this is misattribution, outside of academia this does not exist.

This is clearly not copyright infringement either as LLMs do not claim copyright, nor could they. Just like the photograph taken by the monkey, or pictures drawn by crows. LLM output is not a creative work either.

Whether this is unethical or immoral is a totally different question. I really don't think so, and I don't think you argue that position well.

eschaton · 2026-06-05T21:59:57 1780696797

It is misrepresentation for gain, that gain does not need to be monetary to be material. For example, it can be reputational.

It also is copyright infringement, because what the LLM “generates” are actually portions of its training set, which were covered by copyright. Just passing through an LLM does not remove that copyright from that work.

infamouscow · 2026-06-05T22:32:12 1780698732

It's only fraud if a person signed their name stating such.

Their name being attached to the commit is itself irrelevant, as there is no way to submit a patch otherwise. You could use a fake name, but you're just moving this fraud problem around.

You're going to have a hard time convincing anyone that using a tool constitutes fraud. Frankly, it's silly, if not genuinely stupid.

Film photographers in the early 2000s routinely called digital "not real photography" and Photoshop "cheating" because you could delete bad shots and fix everything later. Traditional musicians and critics dismissed drum machines, synthesizers, and autotune as soulless tools.

eschaton · 2026-06-05T22:40:25 1780699225

Intent and custom both matter quite a bit in law. It is customary to treat the name attached to a commit as the copyright holder of any changes represented by that commit, just as it was for the sender of an email containing a patch back when that was how such work was done.

Often this is also spelled out in a project’s contribution guidelines, and some projects have even had more explicit copyright assignment policies they required contributors to agree to, but the lack of such guidelines or assignment policies does not mean the custom as normally observed in the field is irrelevant.

overgard · 2026-06-05T19:34:18 1780688058

I mean, I don't think commits are the place for tool attributions. I want to know what the change was, I'm not really interested in your tool selection (put that in the PR if it's relevant). It'd be just as irrelevant to see "written on my macbook in neovim"

hnav · 2026-06-05T19:37:57 1780688277

Depends on what the claude attribution actually means. A lot of people will just get the thing building and then ship. To me that attribution is generally a red flag.

eschaton · 2026-06-05T21:10:00 1780693800

It means “this contribution likely infringes someone else’s copyright.”

potsandpans · 2026-06-05T18:47:26 1780685246

I think it will be funny to watch people lose their collective minds when open source maintainers start requiring llm use.

This idea that the community can try to pressure open source maintainers about the tools they use based on knee-jerk political reactions is so offensive.

Let's go the opposite way: "sorry I'm closing this pr because it didn't use an llm."

matheusmoreira · 2026-06-05T18:53:18 1780685598

It makes no sense at all to do that. The only thing that matters is whether the code is good.

eschaton · 2026-06-05T21:22:58 1780694578

That’s not the only thing that matters. The provenance of the code also matters enormously, specifically whether the person contributing it actually has the right to do so.

If I contributed code to an Open Source project behind my old employer’s back, that would have been bad, because that code was owned by them and not me, even if I wrote it on my own time using my own equipment, because of the contract I signed with them.

If I copied code out of an AGPLv3-licensed codebase and contributed it to a BSD-licensed codebase without telling anyone, that would have been bad, because I did not have the right to change the license on that code to BSD (or change the license on the codebase to which I was contributing to AGPLv3).

If you use an LLM to produce code, you may well be doing the latter since an LLM is actually just regurgitating portions of its inputs. This is not a hypothetical scenario; I’ve personally encountered a case of someone using an LLM to attempt to contribute code I recognized from a specific Open Source project under one license to another project under a different license, while claiming they “wrote it themselves.”

Any project that accepts contributions needs to take liability seriously and manage their risk appropriately.

matheusmoreira · 2026-06-05T21:28:41 1780694921

"LLM produced licensed code and person contributed it" is indistinguishable from "person contributed licensed code". The LLM is irrelevant. Result is the same as if they had copy pasted it.

eschaton · 2026-06-05T22:05:33 1780697133

Yes, exactly.

Unfortunately, a large number of people are being told—and here, you can see many who believe it—that the output of an LLM either carries no copyright or is copyright by the one prompting it. In other words, even right here on Hacker News it’s widely believed that LLMs “launder” copyright.

matheusmoreira · 2026-06-05T22:25:00 1780698300

Irrelevant either way. It's your name on the commit, and the code either infringes or it does not. Whether an LLM was used is immaterial.

eschaton · 2026-06-05T22:28:37 1780698517

Not irrelevant. A large number of people who would not copy and paste code from one project to another will attempt to contribute the copyright-infringing output of an LLM and not think twice.

potsandpans · 2026-06-05T21:44:19 1780695859

The genie is out of the bottle here. If this were true then all fortune 500 companies would be pearl clutching and limiting their developers access to these tools.

But for better or worse I can assure you (for which you have no reason to believe me, just look at the headlines): nearly all tech companies are setting internal goals to have x% of code generated by llms by y date. And speaking as an insider, that x number is very large and that y date is very soon.

And before everyone continues to downvote me because I'm saying things that you don't want to hear, you have to realize that this is the world we live in now.

So, either you're right and the legal entities attached to some of the most powerful tech corporations have just decided to flout the law. Or you are missing something, or the game has changed.

Open source projects that want to hide behind provenance as a gate keeper to introduce llm generated code into their code base are going to get smoked.

There's nothing stopping a company like anthropic from funding an open source division that starts forking projects and accelerating the development. Expect 1000x more Buns.

There's nothing stopping a wealthy individual who wants to do that.

When the dust settles, no one is going to be worried about what you've typed here.

And if somehow the ip lawyers and capitalists won, then China will become the tech hub of the world.

Whether it's right or wrong, that is the reality.

eschaton · 2026-06-05T22:12:42 1780697562

The Fortune 10 company that I spent decades at and retired from just a couple years ago noticed this issue immediately and issued a blanket ban on the use of these tools for the company’s own code that to my knowledge has not been rescinded. (They also started developing their own coding-specific LLM, training solely on code they owned, around the same time.)

You might consider that there is a very large incentive by the large and public players in this market to promote the idea that this is not true, that they consider themselves large and powerful enough to actually flout the law, and that they plan to use the argument that enforcement will be too damaging to the economy to make their view the “new normal.”

This playbook has been run before, by Uber and Lyft, by AirBnB, by Tesla with “FSD,” and so on. It’s very clearly the approach being taken.

potsandpans · 2026-06-05T22:25:18 1780698318

Well, I've personally worked at 3 of the fortune 10s (two from pre llm mania days) and I know for a fact that they're full tilt, from keeping up with old colleagues, plus where I'm at currently.

I just looked at the list and I have friends that work at most with the exception of United, McKesson, Berkshire and Cencora, so either you were at one of those or you're misinformed about your ex employer.

The entire industry for the most part is all in here.

We clearly disagree at an ideological level, for which I will not try to convince you my side is correct.

Instead, I would probably be willing to bet overall maybe 10k USD that your stance is generally not representative of where we end up in 5 years.

Let's make a Polymarket and compete with dollars instead of words (slightly in jest)

eschaton · 2026-06-05T22:30:27 1780698627

Or you’re misinformed about what my old employer is actually doing, or how they’re doing it.

potsandpans · 2026-06-05T23:12:56 1780701176

I'm not

archagon · 2026-06-05T22:57:42 1780700262

Is this comment LLM generated?

Have fun with 1000x more Buns that literally no one is using or maintaining. An entire software industry built on top of a burning garbage pile of crappy, dead code.

potsandpans · 2026-06-05T19:25:48 1780687548

This is my whole point. The whole thing is ludicrous.

And lo and behold, people are losing their collective minds, brigading my posts, flagging me and demanding credentials.

automatic6131 · 2026-06-05T18:48:40 1780685320

"let's go the opposite way"

Do you have any popular open source projects? Or are you just an Internet gremlin?

potsandpans · 2026-06-05T18:59:45 1780685985

I'm a successful distinguished engineer within mag 7, what are your qualifications? Please send me your resume and social security number to verify that you're qualified to speak on the matter.

mohamedkoubaa · 2026-06-05T20:25:16 1780691116

I'd be willing to bet that an undisclosed LLM disclosure will follow a developer around for the rest of their career

eschaton · 2026-06-05T21:23:27 1780694607

That kind of fraud absolutely should. (I suspect you mean “undisclosed LLM use.”)

mohamedkoubaa · 2026-06-05T22:04:28 1780697068

Thank you, that's what I meant

scsh · 2026-06-05T13:12:34 1780665154

> It does not control for commit complexity, security intensity, or bug severity. It does not distinguish between a one-line typo fix and a CVE patch. It is a blunt instrument. But the critics' accusation is also blunt: "Claude is making things worse." A blunt instrument is the fairest response.

If by fairest you mean to say that this analysis and response is sufficient, then I'm sorry but I have to disagree. We really need to understand if the nature of the bugs are worse from a user's perspective. Even if the rate stayed unchanged, if the result is the perceived quality of the software declined then I would personally consider that worse, especially if I were a project maintainer.

That's not meant to be wholly dismissive either. But in general, I don't think quantitative analysis alone is enough to fully answer this type of question.

skeledrew · 2026-06-05T13:58:40 1780667920

But it is fair. Up to this point I have yet to see anyone say they did an analysis of the code and found X regressions of Y severity. All they say is "there are more bugs because LLM". This analysis, which you can verify yourself if you wish, says "the bugs [number of] are pretty average even with LLM", which is a direct response to that. If you'd like a more nuanced analysis you're welcome to do one and share the result, if you're so inclined.

MostlyStable · 2026-06-05T19:37:43 1780688263

That which is asserted without evidence can be dismissed without evidence. This is more evidence, and of greater rigor, than was used to make the assertions. That's good enough for me. If someone wants to actually do the work to support the original claims with better evidence, great. I'd love to see it. Until then, I'm going to not worry about this issue.

lbrito · 2026-06-05T20:33:19 1780691599

Wait, how is any of this relevant if there were only 2 Claude commits? My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

logicprog · 2026-06-05T20:35:03 1780691703

Depends on the methods you use. If you're trying to fit curves and so on, yes. The methods I use were designed for very low amounts of data, and are generally okay for that, specifically and especially when you're just trying to show a lack of evidence for some non-null hypothesis.

And again, that's kind of the point. There's exactly zero actual evidence, however you slice it, that "Claude broke rsync" except cherry-picked anecdata, and the whole point of my analysis is to demonstrate the total lack of any such trend/evidence at all, and just how in-distribution/normal these releases are, to show that if people hadn't known Claude was involved in them, they wouldn't have remarked on them.

wlonkly · 2026-06-05T22:57:15 1780700235

It's not uncommon to have small amounts of data come out of experiments. These are appropriate tests for the size of the data. These tests failed to disprove the null hypothesis.
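
The thread does not name the specific tests used, so the following is only a sketch of one small-sample method, an exact permutation test, offered as an assumed example. The per-release rates are made-up placeholders, not rsync data, and the function name is hypothetical. It enumerates every way of relabeling the releases and asks how often the relabeled "recent" group looks at least as bad as the observed one.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """One-sided exact permutation test on the difference of means.

    Returns the fraction of relabelings in which mean(a) - mean(b) is at
    least as large as the observed difference.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    hits = 0
    count = 0
    # Enumerate every possible choice of which releases get the "a" label.
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diff = a_sum / n_a - (total - a_sum) / n_b
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


# Hypothetical bugs/10c rates, for illustration only.
recent = [5.0, 4.0]
historical = [3.0, 6.0, 4.5, 5.5, 2.5, 7.0]

p_value = exact_permutation_p_value(recent, historical)
print(f"one-sided p-value: {p_value:.3f}")
```

A large p-value here would only mean the data give no evidence that the recent group is worse. It would not show that the two groups are the same.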

mikaeluman · 2026-06-05T19:13:11 1780686791

Not going to critique this survey. Must have taken a lot of time and required a lot of patience. Great work!

I think it will be up to some group in academia to make a real full blown study across several repositories.

There must be tons to learn on how LLMs have changed software development and perhaps the cleanest separation will simply be going by what repositories declare e.g. "No LLM involved" vs those that proudly do the opposite or are neutral.

Bugs is not the only variable of interest here. I am guessing someone is already doing this as we discuss it here...

dvt · 2026-06-05T21:52:02 1780696322

It's always the most insufferable people that make the biggest hullabaloo about a project they have nothing to do with and have never contributed to. People with literally zero skin in the game using the AI boogeyman to push some agenda or some anti-agenda. OSS has become so incredibly toxic in the past decade, and consumers of OSS have become extremely entitled.

I run a smallish project with ~1k stars and I've stopped maintaining it last year because people feel like they're absolutely owed features or bug-fixes or whatever. It's tiring and a complete shame that the author has to make such an insane deep dive into a random accusation that just caught on social media. I want to emphasize that this has nothing to do with AI, it's just tech tourists, consumers (as opposed to creators), and engagement farmers that have taken over. AI slop probably doesn't help, but the underlying issue has been brewing for at least a decade.

Also, the "making soup for the homeless & pissing in it" is not only an off-base analogy (software is pretty low on Maslow’s Hierarchy of Needs), but also somehow looks down on both people in need and the volunteers that help them. Just absolutely gross.

Panino · 2026-06-05T23:24:38 1780701878

> It's always the most insufferable people that make the biggest hullabaloo about a project they have nothing to do with and have never contributed to.

Agreed, and similarly, as a hobbyist programmer who loves Rust and Go, I've always felt that the people who command others to "rewrite it in xyz" are not themselves developers, they're "ideas people." There's a mass of these people whose main interactions with the world are through the dramatic forcing of their correct opinions.

> I run a smallish project with ~1k stars and I've stopped maintaining it last year because people feel like they're absolutely owed features or bug-fixes or whatever.

That's a bummer and it's something I'm fearful of. I post some code on my website, not on a github type site, and don't interact with people about it. It's nice and plenty of people do it. Is that something you'd consider?

faitswulff · 2026-06-05T13:04:10 1780664650

> The analysis uses a single metric: bugs per 10 commits (bugs/10c).

Bugs per commit as a metric papers over severity, both in terms of security severity as well as the effect on the user. A mislabeled button has the same weight as the entire app crashing in this framework.

germanjoey · 2026-06-05T18:26:42 1780684002

IMO "bugs per commit" is even worse than that, because, in addition to what you say, it also hides the extraordinary spike of commit activity of a project that had previously been stable. [0]

It is the exact metric you'd choose if you wanted to make the current situation of rsync look like not a big deal.

[0] https://github.com/RsyncProject/rsync/graphs/commit-activity

logicprog · 2026-06-05T18:42:33 1780684953

Yes, but we know why there was an "extraordinary spike," and it has nothing to do with rsync being "vibe coded." The maintainer has directly addressed this.

vsundar · 2026-06-05T21:21:03 1780694463

> The maintainer has directly addressed this.

Not sure if this is mentioned somewhere else, but looks like the maintainer has a blog post that addresses this: https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

floxy · 2026-06-05T18:44:40 1780685080

Seems like this would be a good place to link to that.

logicprog · 2026-06-05T18:54:28 1780685668

I link to it multiple times in TFA and quote the specific thing I'm talking about here in there to explain that possible confounder. I think I've done more than the work I'm obligated to do to make all of the relevant information available to you. You are just refusing to use it.

runarberg · 2026-06-05T19:23:01 1780687381

I am not finding these links in TFA, I see a link to an issue #929 which (as mentioned in TFA) has over 350 replies, and an opinionated summary of what transpired, including some detailed description of specific posts there. However I did not find the maintainers response.

Of interest is this post here: https://github.com/RsyncProject/rsync/issues/929#issuecommen... which echoes the same concern which was raised up thread, however, I failed to find the maintainers’ response.

EDIT: Found it! it is in the (untitled) discussion section (after the results).

https://lobste.rs/s/k1b0za/rsync_outrage#c_2iowov

EDIT 2 (and advice on design): The page design changes backgrounds after the results sections, which kind of conveys to the user that they have reached the end of what is important and can just skim over the rest (usually pages have a radical change in typography like these when you’ve reached the comment section), however this is what is analogous to a discussion in a typical paper, and is arguably the most important part. I had simply assumed that you just left it at the result and skipped the discussion as a stylistic choice.

logicprog · 2026-06-05T19:33:47 1780688027

> EDIT: Found it! it is in the (untitled) discussion section (after the results).

I also paraphrase Tridge himself explicitly saying that this is why commits/releases have increased:

> Essentially, this isn't a "Claude" problem, it's a "more security work" problem, something that Tridge himself confirmed in his response, describing how a flood of AI-generated CVE reports forced rapid, extensive changes to rsync's attack surface.

> The page design changes backgrounds after the results sections, which kind of conveys to the user that they have reached the end of what is important and can just skim over the rest (usually pages have a radical change in typography like these when you’ve reached the comment section), however this is what is analogous to a discussion in a typical paper, and is arguably the most important part. I had simply assumed that you just left it at the result and skipped the discussion as a stylistic choice.

Good point, I assumed everyone would read till the end, that's on me. I'll give it a heading.

ex-aws-dude · 2026-06-05T18:19:56 1780683596

Why don't you prove the bugs increased then?

Why is it that some unfounded claim is made and the onus is suddenly on the project maintainer to prove it beyond all doubt?

It should be on the person making the claim to prove it

logicprog · 2026-06-05T19:37:00 1780688220

I've now resolved this. The new version, which should be live on GH Pages soon, uses — what I think is — a pretty good methodology for assigning severity to each bug, normalizes it to 0.0-1.0, sums that, and treats that as the total severity weighted bugs, then does the analysis based on that. It did not change the analysis in any material way.
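
A minimal sketch of the arithmetic described above, under stated assumptions: the raw rubric score is taken to run from 0 to 100, the `Bug` class and both function names are hypothetical, and the numbers are made up rather than taken from the rsync data. It is not the script used for the analysis.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    release: str
    severity: float  # assumed raw rubric score on a 0-100 scale


def bugs_per_10_commits(bug_count: float, commit_count: int) -> float:
    """Bugs per 10 commits; bug_count may be a plain count or a weighted sum."""
    if commit_count <= 0:
        raise ValueError("commit_count must be positive")
    return 10.0 * bug_count / commit_count


def severity_weighted_count(bugs: list[Bug], max_score: float = 100.0) -> float:
    """Normalize each score to 0.0-1.0 (clamped) and sum the results."""
    return sum(min(max(b.severity / max_score, 0.0), 1.0) for b in bugs)


# Made-up numbers for illustration only.
bugs = [Bug("vX", 90.0), Bug("vX", 30.0), Bug("vX", 60.0)]
commits = 12

raw_rate = bugs_per_10_commits(len(bugs), commits)
weighted_rate = bugs_per_10_commits(severity_weighted_count(bugs), commits)
print(f"raw: {raw_rate:.2f} bugs/10c, weighted: {weighted_rate:.2f} bugs/10c")
```

With this weighting a minor bug contributes a fraction of a bug to the total instead of a full count, so the weighted rate can never exceed the raw rate.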

skeledrew · 2026-06-05T13:48:20 1780667300

There was no analysis of severity in all of the rage posting that occurred. The single point being pushed was "use of an LLM led/leads to more bugs". The author specifically states that's what they're addressing (blunt accusation -> blunt response).

atmavatar · 2026-06-05T16:33:06 1780677186

The specific problems mentioned were all reasonably severe. The original post itself described a show-stopping bug:

    So my systems recently updated to rsync 3.4.3, and as soon as that happened my backup system - which does incremental backups using multiple --compare-dest= arguments - started to fail on anything but a full backup.

Incremental backups is perhaps the primary use of rsync, and they were broken for this person. That's pretty severe.

The second reply is similar:

    i wondered why my 3d printers were running like sh*t and at 100% cpu; turns out log2ram uses rsync.

This one I took with a grain of salt, since it read more like a dogpile than an actual bug report. However, if it's genuine, it's also reasonably severe.

Later in the comments, someone attempted to provide a list of issues that had been added: https://github.com/RsyncProject/rsync/issues/929#issuecommen.... The list included several failures to build or run rsync that appear to have resulted from broken backward compatibility. That seems reasonably severe. If intentional, I would have expected mention in the release notes about the removal of backwards compatibility, but none was made.

The issue comments already degraded into a lot of unnecessary vitriol even before the above mentioned comment and only gets worse from there, so I stopped. But, the fact remains that the whole issue started with a severe bug.

I applaud the attempt at dispassionately analyzing whether the recent LLM releases of rsync were normal or outliers as far as bugs are concerned, but I don't think you can do so properly without analyzing severity.

skeledrew · 2026-06-05T18:05:04 1780682704

To keep such an analysis fair and contextually relevant, it would have to be extended to the previous 928 issues as well (of course filtering for bug reports). I don't see anyone doing such an analysis, I think because they don't expect they'd find it useful (at least not as the rage fuel that many are seeking); what they'd be more likely to find is that there is a similar severity-mix going all the way back to v1.0.0, because these things inevitably happen whether coding is done by human or machine.

"A lot of claims in the wider discussion have treated every recent bug report as if it had the same cause. That is not accurate. Some reports were regressions from recent security hardening, some were missing historical test coverage, some were older bugs found because rsync suddenly had more eyes on it (especially by AI that can find issues quickly) and some were packaging or environment-specific failures. A Co-authored-by line is not enough by itself to establish root cause." - https://github.com/RsyncProject/rsync/issues/929#issuecommen...

logicprog · 2026-06-05T13:19:10 1780665550

Okay, I really have to point out to everyone: the numbers and report cards are TEMPLATED IN BY A SCRIPT. Hallucinations are a moot point. https://github.com/alexispurslane/rsync-analysis/blob/main/s...

parliament32 · 2026-06-05T21:23:10 1780694590

Thank you for (re)writing this in your own voice. Despite how much effort might be put into methodology, data collection, etc.. reading slop is unbearable, full stop. It's not intentional, but I have almost a nauseated reaction when the "AI tone" comes though, regardless of how good the data or how accurate the writing is.

Your verbosity and sentence structure are not a problem. I hope that publishing this gives you a bit more confidence in your writing, because it's legitimately good.

AEVL · 2026-06-05T21:19:11 1780694351

How does the analysis look if we only count the >=90 severity cases—that is, if we downgrade the severity of all <90 cases to 0?
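
The variant asked about here could be sketched as follows, assuming a raw 0-100 severity score and a hypothetical `thresholded_weight` helper. The scores and commit count are made up, so the printed rate says nothing about rsync itself.

```python
def thresholded_weight(score: float, cutoff: float = 90.0,
                       max_score: float = 100.0) -> float:
    """Weight is 0 below the cutoff, otherwise the normalized score."""
    return score / max_score if score >= cutoff else 0.0


# Made-up severity scores for one hypothetical release.
scores = [95.0, 89.0, 90.0, 40.0]
commits = 20

weighted_bugs = sum(thresholded_weight(s) for s in scores)
rate = 10.0 * weighted_bugs / commits
print(f"high-severity weighted rate: {rate:.3f} bugs/10c")
```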

geraneum · 2026-06-05T13:01:27 1780664487

> But the critics' accusation is also blunt: "Claude is making things worse." A blunt instrument is the fairest response.

So the criticism was bad, and that somehow makes it ok to use a bad metric?

logicprog · 2026-06-05T13:03:32 1780664612

That's not what I'm saying. What I'm saying is that if the criticism is referring to a broad set of metrics like bugs per release and number of commits that were made by Claude, then it's correct to look at precisely those things because that's what the claim is about.

abirch · 2026-06-05T18:21:25 1780683685

AI + Interest != Expertise

I come to hn because I get very nuanced, informed information and glorious puns.

epolanski · 2026-06-05T20:20:13 1780690813

What would be a better one?

iainctduncan · 2026-06-05T23:10:18 1780701018

What strikes me about the post is that it goes to great lengths to talk about proper statistical methods, but then is written in the most clearly biased language ("what stupid AI haters get wrong", etc.). If you want people to take your study seriously, why wreck it by coming across with such a strong prior bias? I stopped reading...

rovr138 · 2026-06-05T12:56:02 1780664162

I'm just curious about testing.

Is this a configuration that's not common and thus not tested?

If people think they can do better, I want to see their forks and them keeping up with it.

https://github.com/RsyncProject/rsync/graphs/contributors?fr...

tptacek · 2026-06-05T19:26:45 1780687605

This is a neat post and I'm glad it got written and this is a little bit off-topic but:

Hey, 'logicprog, your writing is fine!

Use LLMs to critique your writing, check its structure, vet your choice of topic sentences, check flow from graf to graf and section to section, look for passive voice and overused words. LLMs are fantastic for that. But don't use a single word an LLM suggests in your actual writing. If it suggests something really fucking good, too bad, those words are disqualified. It's an easy red line to adhere to, easier than it sounds, and it'll keep your writing human.

(You ended up somewhere around here anyways, but that was after you posted something with LLM-written language because you weren't confident enough in your own writing. The things you do "worse" than an LLM are what make you you; be protective of them!)

logicprog · 2026-06-05T19:45:20 1780688720

Thank you!

Polarity · 2026-06-05T12:58:49 1780664329

so the answer is: no. actually less bugs. thanks

gjvc · 2026-06-05T19:14:26 1780686866

"fewer"

WesolyKubeczek · 2026-06-05T20:29:23 1780691363

The discussions around this have devolved to excrement anyway, I feel tempted to invoke the meme where the goose, asking a guy what his jacket is made of, asks “where is your reproducer case!?” instead.

Instead we have a shitstorm over a presumably legit issue, for which the only source is some mastodon post.

One command that used to work in 3.4.1 and stopped working in 3.4.3. Just one! We could have already bisected the living shit out of this and go home, but no.
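
For anyone unfamiliar with bisecting, the idea is a binary search over the commits between a known-good and a known-bad version, which `git bisect` automates. A minimal sketch, assuming a linear history and a hypothetical `is_bad` check that runs the failing command against a given commit (the commit names below are placeholders):

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history of 16 commits where the regression lands at c11.
history = [f"c{i}" for i in range(16)]
tested = []


def check(commit: str) -> bool:
    tested.append(commit)
    return int(commit[1:]) >= 11


culprit = first_bad(history, check)
print(f"first bad commit: {culprit}, checks run: {len(tested)}")
```

The number of checks grows with the logarithm of the commit count, which is why a single reproducer command is enough to pin down a regression quickly.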

logicprog · 2026-06-05T19:17:10 1780687030

Another update: did an automated severity analysis on each bug report (~2000 of them!) using an LLM at temp=0 with a very strict rubric (and I checked to make sure that it rated things in a consistent, stable way using it). The rubric, LLM used, and some example ratings are included in the methodology section. For now, the information was just stored per-bug in the DuckDB and used to filter out non-bug bugs, to get a clearer signal. I'm going to try to use it to see if the post-Claude bugs were more severe in any way next.

KronisLV · 2026-06-05T19:36:27 1780688187

Pretty cool site!

> v3.4.3 has been out long enough that its rate (5.00) is already comparable to historical releases. The "wait and see" argument is an appeal to an unknowable future that shifts the burden of proof away from the critics. If more bugs surface, they will enter the distribution like every other release. There is no reason to expect a regime break.

I mean, as someone who uses LLMs, it might be a good idea to consider how one might limit the amount of bugs that will appear in the future at least a little bit: parallel iterative code review loops would probably be the easiest and most applicable to LLMs, though I guess test coverage and other code analysis tools help too.

steno132 · 2026-06-05T20:32:37 1780691557

This is just narrow thinking. Say Claude did increase the bugs in rsync by a negligible factor.

So what? You've saved a significant amount of time for a decent number of humans, and if those humans are working on other projects, the overall net output for the world is net positive compared to without LLMs.

You have to broaden your perspective. It's not just about how rsync was affected.

boxed · 2026-06-05T20:33:59 1780691639

Let me translate this comment:

> ok, so I was wrong and badly, but I will double down and say I was right anyway

tiahura · 2026-06-05T21:49:16 1780696156

Write with your own voice and then polish with ai.

dgellow · 2026-06-05T21:57:17 1780696637

Or just do not polish? Write with your own voice, accept it as it is: humans communicating with humans.

overgard · 2026-06-05T18:41:48 1780684908

The TLDR seems to be: needs more data.

WhereIsTheTruth · 2026-06-05T22:15:57 1780697757

LLMs don't create bugs, people do

PunchyHamster · 2026-06-05T19:59:25 1780689565

The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

vintagedave · 2026-06-05T20:33:57 1780691637

Why not? Claude marks its commit messages. That there were none, and then there were, seems a signal.

Especially since if the earlier commits were so clearly AI authored yet without the Claude marker, surely you or anyone would be able to spot them. You could say, X commit does not have the Claude commit marker yet was AI written. But for all the speculation on this thread, I haven’t seen anyone actually doing that. What may be possible is that the rsync maintainers used AI to assist yet reviewed and edited themselves, as many devs do, and if so then the stats in this article are still notable: there are no poor quality outliers that can reliably be attributed to AI and if one specific release (3.4.0) was, the subsequent releases which presumably also had as much AI as this speculative hidden AI release only show improvement and thus act as a pro-AI argument.

The blog has many more datapoints than two. It compares many releases. You’re looking at 2-vs, not 2.

logicprog · 2026-06-05T20:23:20 1780691000

> Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

I'm using methods appropriate to that low amount of data, first of all. Second of all, since I'm only trying to show there's no evidence for the anti-AI hypothesis (not disprove it, or prove the null hypothesis), that's sufficient in itself. Also, I wonder why nobody said things like you're saying ("there's too little data to tell") in response to all the absolutist claims that AI caused rsync to get worse?

> The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

At this point, you're just positing Russell's Teapot: you'll keep assuming more and more of the code was "secretly" Claude when there's no evidence for it and no reason to think so, just because you've started with the assumption that Claude makes things worse and you want to find a way to prove it.

yobid20 · 2026-06-05T19:46:28 1780688788

needs a tldr; im not reading all that. maybe claude can summarize it for me.

logicprog · 2026-06-05T20:29:32 1780691372

And anti-AI people accuse people who use AI of being intellectually lazy. First of all, it's long because it's expanded to respond to all the criticisms. It seems that either something can be short, and dismissed as incomplete, or it can be complete, and dismissed as being long. Nice Kafka trap. Additionally, there's literally an Executive Summary section right there, for your TLDR.

noAnswer · 2026-06-05T21:24:30 1780694670

Ask your Clanker what a joke is.

themafia · 2026-06-05T20:24:00 1780691040

> If anyone complains about my verbosity or sentence structure — as they usually do, which is the reason I originally let the AI write the prose, among other reasons obsoleted by templating — they can go fuck themselves.

You can write for an audience or you can write for yourself. Which is fine either way but you shouldn't pass the blame for bad results on to your audience.

> and receiving almost no substantive input, discussion, or response on the actual content of the article

Well did you write it for that purpose?

> "Just wait, more bugs will surface" -- v3.4.3 has been out long enough

Wait for _more releases_. As your own data shows, the bug rate is not consistent between releases, so this is probably not a worthwhile metric. Perhaps systems touched, new features included, or attempted fixes would be a better way to contextualize releases and the goals of the author.

pushcx · 2026-06-05T14:40:52 1780670452

    What followed was extraordinary: 329 comments and counting, ranging from thoughtful concern to outright harassment.
    The thread did not stop at words. One user posted My Little Pony drawings of themselves strangling the "project janitor that pushed vibecoded commits":
    It spread to Hacker News and Lobsters, generating hundreds more comments.

This is false, it did not appear on Lobsters. Here is the function in the codebase that prohibits this kind of brigading: https://github.com/lobsters/lobsters/blob/main/app/models/st...

Please correct your article.

tptacek · 2026-06-05T18:58:22 1780685902

It is neat that Lobsters has this feature (and HN should too), and I'm glad you took a beat to explain it. I think you didn't need the last sentence, though.

logicprog · 2026-06-05T18:45:00 1780685100

I have done so! That was a misremembering on my part. The first mention of Lobsters is now here:

> On Lobste.rs, in response to the Medium essay Tridge himself posted in response, finally some users like boramalper begin to actually ask for evidence one way or another:

nairboon · 2026-06-05T12:58:17 1780664297

Is this an analysis made by/with Claude?

quentindanjou · 2026-06-05T13:12:49 1780665169

It very obviously is. "The Outlier Nobody Noticed" -_-"

overgard · 2026-06-05T19:15:18 1780686918

FWIW, I asked ChatGPT to review the article just for my amusement. Its conclusion was:

"My honest assessment is that this is a competent calculation performed on a badly confounded measurement, followed by conclusions substantially stronger than the calculation warrants. It is useful as a rebuttal to “the Claude releases are obviously unprecedented disasters,” but not as evidence that Claude was harmless."

dang · 2026-06-05T18:37:48 1780684668

[stub for offtopicness]

[see https://news.ycombinator.com/item?id=48416020 for how all this happened in the first place]

logicprog · 2026-06-05T12:58:17 1780664297

Some notes on this:

- I used GLM 5.1 to help with the coding and math for this.

- However, I explicitly dictated where the data should be pulled from (GitHub, Bugzilla, mailing list), how it should be tagged and grouped, and what data to look at (e.g. bugs instead of regressions)

- Additionally, I consulted with my wife, who has a master's degree in statistics from Penn State University, about what sort of statistical methodology would be justified for this very limited data set while still giving as much information as possible.

- I know the website looks like we stereotypically consider vibe-coded websites to look, but I actually explicitly asked for that. The original HTML design looked like a website from 1995, and I just prefer how this looks. It's pretty!

jchw · 2026-06-05T13:00:48 1780664448

I really struggle to believe you wrote text like:

> A simple distributional analysis of every rsync release with bug data. No model. No assumptions. Just placement.

logicprog · 2026-06-05T13:04:36 1780664676

No, I didn't write the text itself. I'm typically significantly more verbose and elliptical. More than that, I was trying to make this as accurate and fair as possible, so the numbers and methodology changed often over the last couple of days I was working on it, often enough that keeping the whole thing up to date manually would have been problematic.

jchw · 2026-06-05T13:27:31 1780666051

Sorry to say but I'm absolutely certain I would've preferred to read your worst attempt at a write-up over the grating utter shite LLMs output. It's not even a question, this is unreadable.

logicprog · 2026-06-05T13:30:52 1780666252

That's interesting; IME, most people get equally angry and are as likely to disengage with a superior tone over my autism-infodump verbose essay prose as with LLM output.

ok_dad · 2026-06-05T18:31:15 1780684275

At least when I write an autistic info dump people know I wrote it. Why give your voice over to a corpo slop factory?

Heck, I use LLM assistance for coding and I’ve even coded up whole features with the clankers, but giving it the right to speak for me is too much.

I should also add that I read and understand every line of clanker output that I publish for others, so I’m not a vibe coder either, just adhd.

skeledrew · 2026-06-05T13:43:21 1780667001

I read it perfectly fine. I see content, not style.

grey-area · 2026-06-05T18:35:38 1780684538

Style is also part of the content. Word choice, grammar, register, and tone all affect meaning and communication of that meaning. The medium is part of the message.

So your statement betrays a significant misunderstanding - there is no neat clean divide between style and content.

Also, LLMs often generate text that is plausible, but wrong, in ways big and small.

skeledrew · 2026-06-05T19:13:40 1780686820

Well, I got the meaning in the article fine, and have no complaints.

> Also, LLMs often generate text that is plausible, but wrong, in ways big and small.

So do humans. Always have, always will.

grey-area · 2026-06-05T19:46:53 1780688813

Humans acting with intention do it a lot less. The difference is that LLMs don’t act with intention.

skeledrew · 2026-06-05T20:15:55 1780690555

No, the difference is in the education/experience of any given human, which is mostly gated by age. Like you'd generally expect someone young to make a lot of mistakes, and as time went on they'd learn and make fewer. Pretty much the same with LLMs, which have been around for... a bit over 5 years now? What would you expect of a 5 year old acting with intention? Or 10? Or even a 15 year old?

jchw · 2026-06-05T14:29:29 1780669769

When you say, "I see content, not style," you are separating what is being said from how it is being said. While it is great that you can extract the core message, you are missing a fundamental truth about writing: style and content are rarely completely separate. Writing involves both.

Poor prose does not just make writing ugly — it creates friction, obscures nuance, and introduces ambiguity.

You can eat a gourmet meal out of a dirty paper bowl. You still get the calories, but the delivery mechanism definitely impacts the experience and the perceived value of the food. Same food, different response.

See? I can write slop too, I don't even need to burn down a forest to do it. If you are OK with every fucking thing being written exactly like this, good for you. I am not.

skeledrew · 2026-06-05T19:11:09 1780686669

The internet is going to really suck for you if you keep that attitude, because LLM use will only increase. Though also maybe not too much as the LLM-isms will likely be fine-tuned out of them to the point that the only way you'll be sure something is done with one is if the author left a note saying such. But maybe that'll make it suck even more as then you'd be without a definite target most of the time, always wondering how much of the thing you're reading is by human and how much by LLM...

jchw · 2026-06-05T20:06:01 1780689961

Uh... huh.

I waited a minute to make sure you weren't going to delete this post because frankly, if I had written it, I would have. Guess not, so... Here goes.

No. It is not the fault of my "attitude" that the Internet is going to suck. That is a complete reversal of the reality. The fact that even people without bad intent are already spreading slop everywhere should be enough evidence to essentially prove that there was never any hope. If this is what good actors are doing, what exactly do you expect from bad actors?

Also, to stress it yet again, I don't care if people use LLMs in general. I'll even say that I don't particularly care very much if people use them without disclosing it in most cases. If you're using it like a normal tool and not merely just dumping the output verbatim there is not any particular need to disclose it any more than you'd disclose other tools, though I think people would prefer if you did just for transparency.

My chief complaint is just how bad LLM slop writing is. It simply is not good at all. It would literally be much better for the Internet if they weren't so turboshit at writing. There is almost no writing style I don't prefer over garbage LLM writing. I'm dead serious. Early LLMs were worse at almost everything else, but they were a lot better at writing for sure. Something went wrong somewhere.

But I do also believe that it is inherently bad to dump prose as-if you are communicating as a human, but said prose isn't actually written by a human. If someone shows me a cool drawing that they made, that means that they sat there and went through the process of sketching, possibly multiple drafts, inking, coloring/shading/painting/etc. to create an expression. This involves many human skills that take years to hone, and every detail carries someone's explicit intention. I think that this is cool, and shows a great degree of skill and effort.

When you, of course, generate some crap from an image generator, it may very well look similar. It may emulate some actual defects that make it look like someone really drew it. But someone didn't. A model went directly from a text prompt and dumped out pixels on screen. No sketching. No layers. No thought processes about how to frame things or what details to include. That doesn't mean zero effort went in: I'm sure in many cases someone sat around and fudged with LoRAs and inpainting for a couple hours and pulled the slot machine lever to get good seeds, etc. That doesn't mean that an AI model does not have some model for how to structure an appealing image: it does, that's obviously why the results can look decent to begin with. But when you dump out an image from an image generator and you wink wink nudge nudge present it as your own and people evaluate it as if you drew it, this is basically fraud. Everyone looking at it who doesn't know it is AI-generated actually believes you went through the normal effort of drawing that image and all of the years of practicing skills and acquiring knowledge that takes. That's bullshit, and it takes away from the actual accomplishments of people who put in the work, like cheating in sports does.

Like yeah, a lot of people are cheating at chess, by passing off engine play as their own, but does that really make it okay? When the entire point is using your brain and not just the raw outputs themselves, doesn't that hit you as a problem?

For generative AI, I personally draw this line at what I feel are expressions of creativity. If you use AI for drawing references, whatever. If you use AI to generate globs of repetitive code, whatever. Code can be creative but I do not view it as an expression of creativity and almost any tool is fair game. If you are using ML models for motion capture or some other data processing thing where humans had to do repetitive work before, whatever. Maybe these tools sometimes do devalue the work, but the LLMs are not doing the interesting part here, they're doing the boring part. (This is, in part, an admission that actually writing code is often pretty boring in and of itself, something that I realize programmers have been inconsistent with in an attempt to justify their value. But, I still believe it to be true.)

So okay, fine. People are reluctant to disclose that they used AI to generate text because they fear the backlash that it will get them. This is understandable. What upsets me about this is that well-meaning people are apparently falling back on the idea that, because LLM backlash is strong, the better option than either simply writing your own damn posts or being honest about your usage of LLMs is to wink wink nudge nudge pass off more or less verbatim LLM writing as if it's a post that you wrote.

I am not ruining the Internet. There is literally nothing I or any group of angry mobs could do that would even remotely slow down the decay of the Internet even if we desperately wanted to.

So in fact, I'm not even trying to not ruin the Internet. I don't particularly care if my attitude is not helping or hurting. I'm not having an attitude as part of some grand strategy to save or destroy the internet. I'm having an attitude, because I am pissed off.

And I am pissed off because I am tired of reading posts the author probably only skimmed themselves.

aozgaa · 2026-06-05T13:26:41 1780666001

In general, it seems HN does not like to read llm-generated articles. I ran into this myself when using an llm to edit some stuff I wrote.

At the time, I found this a bit irritating, but with a few weeks' time I see the merit. The informational content tends to fall into “derivative” territory when LLMs write stuff. And people are here for novelty and some socialization.

Also LLM prose seems optimized for engagement rather than concise communication. Takes longer to sift through linguistic boilerplate to get to the point. (The quoted bit being a case in point)

fireflash38 · 2026-06-05T18:41:09 1780684869

Why would anyone spend time reading something that someone couldn't even spend the time to write themselves?

jchw · 2026-06-05T14:23:33 1780669413

I just find it to be utter dreck. It has one of the most agitating prose styles I've ever seen. I would legitimately rather read actual broken English than the cliché polished turds Claude pops out. I am not an LLM hater, I think these tools are pretty impressive and often even useful, but even if I didn't care about the fact that I want to read communication from humans and not robots (and I do care about that, FWIW) I just find the current LLMs are horrid at writing.

And while the comments are always flooded with people like me, the upvotes seem to tell a different story; clearly LLM writing really does appeal to some people. Or idk, maybe a lot of people who vote on stories and don't comment don't actually read them. Hard to say for sure.

grey-area · 2026-06-05T18:36:41 1780684601

I think it’s just that people don’t read before voting; they upvote on the headline and then come to discuss it here.

otabdeveloper4 · 2026-06-05T18:27:40 1780684060

I don't even know what "just placement" is.

(I need a better model to translate from llmese.)

grey-area · 2026-06-05T18:37:07 1780684627

Sometimes the things word generators say just don’t make sense.

CuriouslyC · 2026-06-05T13:07:26 1780664846

I'd suggest writing the lead-in yourself and boxing AI prose separately from your prose in the analysis for future articles. You can give the humanized summary/eli5/key points, then have "details according to AI" boxes that go into nitty-gritty. People seem to dislike AI ghostwriting, but most of these people still use AI, so perhaps keeping authorship clear and separate will avoid some of the flak.

logicprog · 2026-06-05T13:22:43 1780665763

This seems fair. Of course, now that I've posted this here once, I doubt it'll get constructive engagement again, but I can at least improve this for the future

bri3k · 2026-06-05T14:05:37 1780668337

Even if everything in the article is true, you should not use AI to write this. An analogy would be a tobacco company's report on how smoking isn’t so bad for you.

ex-aws-dude · 2026-06-05T18:22:55 1780683775

So the original unfounded claim has 400+ comments because it's perfect HN ragebait.

The author provides evidence to the contrary and the HNers won't even engage with it, instead just talking about the writing of the article in classic HN bikeshedding fashion.

How about after that we talk about the formatting of the website and the colors?

This site is really going downhill

Where is the accountability for your own opinions?

Are you guys only upvoting things that confirm your existing gripes?

dang · 2026-06-05T18:35:36 1780684536

Comments like this do more of what they complain about, only with an extra layer of judgment.

It would be preferable if someone would seed a better discussion by engaging with the article's claims/observations.

ex-aws-dude · 2026-06-05T19:44:31 1780688671

Why did you, the admin, allow such ragebait to stay on the front page then?

Is that the kind of low-effort post we want around here? Just a link to a github comment of a screenshot?

You're complicit here in fueling the harassment of an open source project

dang · 2026-06-05T19:53:28 1780689208

I don't have enough background info to understand what you're referring to here.

Even if you're right, though, you shouldn't be posting comments that break the site guidelines.

dang · 2026-06-05T17:59:40 1780682380

This submission was heavily flagged, presumably because the article sounded like genai. But the article now says the following:

> After posting this on Hacker News and receiving almost no substantive input, discussion, or response on the actual content of the article, I decided to rewrite all of the prose in my own voice.

I've therefore turned off the flags and hopefully people can actually now discuss the claims/findings being reported.

hypfer · 2026-06-05T18:25:22 1780683922

> I decided to rewrite all of the prose in my own voice.

Soo... it didn't just sound like genai but was genai?

___

Huh. From the article:

> If anyone complains about my verbosity or sentence structure — as they usually do, which is the reason I originally let the AI write the prose, among other reasons obsoleted by templating — they can go fuck themselves.

This is kinda sad, honestly. But also should show the author that doing what people try to bully you into doing will not stop them from bullying you.

Just stick with your unique voice, man. If people don't want to read that, that's fine. They do not have to. You're fine

.. what are those em-dashes doing there though?

ellyagg · 2026-06-05T18:31:30 1780684290

Right, so it’s gonna be a litmus test for knowledge workers going forward whether they can separate style from substance. Genai tells are style. You have to be able to evaluate the ideas.

dang · 2026-06-05T18:34:07 1780684447

I doubt that you can separate style from substance in that way, because you can't separate writing from thinking.

I agree that it will be interesting to see how this develops going forward. One can imagine wildly varying scenarios.

hypfer · 2026-06-05T18:33:31 1780684411

Hm. Nah. Why?

Why should I care? If it's a good thought, chances are it appears without slop around it. If it doesn't re-appear, life will still go on regardless.

No need to sift through noise just to avoid FOMO.

logicprog · 2026-06-05T19:06:55 1780686415

> .. what are those em-dashes doing there though?

You're literally doing exactly the bullying I was trying to avoid, even while denouncing it. I like em-dashes. I have AuDHD, and they help me represent how I think.

hypfer · 2026-06-05T19:11:07 1780686667

> You're literally doing exactly the bullying I was trying to avoid

Uhm, no. Really just no. And, frankly, I find it shameful that you'd throw such an accusation at me.

But I guess we can stop here.

Idk man. The internet can be a bit too much sometimes. I truly get that, but this was too much from your side.

Wish you all the best.

skeledrew · 2026-06-05T19:36:44 1780688204

Why did you point at the em-dashes? It looks very much as though you're accusing the author of an update that was also generated (possible, but they seem sincere enough about wanting honest feedback on the content, and making changes for that). Or you're saying the author - and maybe everyone in general? - should no longer use em-dashes because they're an LLM smell. Yeah I'd feel offended too. It's a real pity I can't find em-dashes on my keyboard, or I'd stick them in this comment.

ajkjk · 2026-06-05T18:34:12 1780684452

The em dashes are fine.

If someone gives them shit about their writing, that's on the critic for being shitty. If they use AI to write, that's on them for being fake. But, to write online at all requires being ready to have people be shitty to you and ideally not reacting in a way that makes the situation worse. Sounds like they need work on that part.

Anyway it is basically always possible for someone to find something legitimately bad about anything a person does. The question is, how much of an issue is that? Not much actually. So you have flaws. Fine, just be flawed. It has no effect on your life beyond your reaction to the attack. And putting aside that reaction is a prerequisite for learning anything useful (or discerning that there is nothing to learn) from the experience.

Good people will trust good intentions through the flaws, while shitty people will write off your work and your intentions because of the flaws (and try to make sure you feel bad about it in the process). But it's always because they're too weak to express disagreement maturely, or sometimes because they're bitter and threatened by your good intentions directly. Either way, it's their flaw, not yours.

hypfer · 2026-06-05T18:37:14 1780684634

I don't think that you can successfully dismiss an obvious AI writing marker with

"No these are fine, now look over there!! <lotsoftext>"

Pay no attention to the man behind the curtain?

logicprog · 2026-06-05T19:11:20 1780686680

Great, so I rewrite everything in my own prose, and now it's still "obvious AI writing," just because I'm literate.

ajkjk · 2026-06-05T20:42:35 1780692155

What? You are confused--human beings write em dashes also. Also you're being a dick to the OP, grow up.

otabdeveloper4 · 2026-06-05T18:25:55 1780683955

> I decided to rewrite all of the prose in my own voice

"Claude, rewrite all of the prose in my own voice."

The funny part is that it probably works.

roywiggins · 2026-06-05T12:57:02 1780664222

> A simple distributional analysis of every rsync release with bug data. No model. No assumptions. Just placement.

If you want me to read your analysis, you are going to have to make it not read like Claude wrote it. What does "placement" even mean here?

rroblak · 2026-06-05T13:03:42 1780664622

Yeah, made me chuckle that an LLM— probably Claude— was used to write this.

The use of "regime shift" is what gave it away for me. I've never seen a human write that, but Claude does from time to time.

At least they removed occurrences of "load-bearing".

roywiggins · 2026-06-05T14:04:00 1780668240

"quietly" seems to be the new one recently

genxy · 2026-06-05T18:08:19 1780682899

Ohhh, quietly load-bearing is the real just. No noise. Pure fact. Delivered robustly.

gamegod · 2026-06-05T13:05:47 1780664747

It's the ultimate product for marketers. It inserts itself as an advertisement into every conversation now and defends itself against criticism. Just crazy. There's no hope for the rest of us.