
Okay, but if the unminified code doesn't match the minified code (as noted at the end "it looks like LLM response overlooked a few implementation details"), that massively diminishes its usefulness — especially since in a lot of cases you can't trivially run the code and look for differences like the article does.

[ed.: looks like this was an encoding problem, cf. thread below. I'm still a little concerned about correctness though.]



You need to use another tool to do the actual renames, like HumanifyJS does:

https://github.com/jehna/humanify


It does seem that the unminified code is very close to the original. In some cases ChatGPT even did its own refactoring in addition to the unminification:

    // ORIGINAL:
    j.useEffect(() => {
        function r() {
            n({ height: window.innerHeight, width: window.innerWidth });
        }
        if (typeof window < "u") return n({ height: window.innerHeight, width: window.innerWidth }), window.addEventListener("resize", r), () => window.removeEventListener("resize", r);
    }, []),

    // UNMINIFIED:
    useEffect(() => {
      const handleResize = () => {
        setSize({ height: window.innerHeight, width: window.innerWidth });
      };

      // Initial size setting
      handleResize();

      window.addEventListener('resize', handleResize);
      return () => {
        window.removeEventListener('resize', handleResize);
      };
    }, []);
Note that the original code doesn't call `handleResize` immediately, but has its contents inlined instead. (Probably the minifier did the actual inlining.) The only real difference here is the missing `if (typeof window < "u")` condition.
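
For illustration, here is a minimal framework-free sketch of what the unminified effect body could look like with the guard kept. The helper name `trackWindowSize` and its `target` parameter are hypothetical (they appear in neither the original nor the ChatGPT output); `target` exists only so the logic can run outside a browser.

```javascript
// Hypothetical helper, not from the article: mirrors the effect body but keeps
// the "does window exist?" guard that the unminified version dropped.
function trackWindowSize(setSize, target = globalThis.window) {
  // Outside a browser page (Node, web worker) there is no window: do nothing.
  if (typeof target === "undefined") return undefined;

  const handleResize = () => {
    setSize({ height: target.innerHeight, width: target.innerWidth });
  };

  // Initial size setting, then keep it in sync on every resize.
  handleResize();
  target.addEventListener("resize", handleResize);

  // Cleanup function, analogous to what the effect returns.
  return () => target.removeEventListener("resize", handleResize);
}
```

Inside a React component this would be called as `useEffect(() => trackWindowSize(setSize), [])`, assuming `setSize` is the state setter from the unminified version.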


the condition is a constant so it can be safely removed


Only in the web environment. In fact the condition itself is true only when it runs in a web browser and not in a web worker.
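
For anyone unfamiliar with the idiom: `typeof window < "u"` behaves the same as `typeof window !== "undefined"`, because `"undefined"` is the only string `typeof` can produce that does not sort before `"u"`. A small sketch to check that claim (the helper names are made up for illustration):

```javascript
// Every string that the `typeof` operator can return.
const TYPEOF_RESULTS = [
  "undefined", "object", "boolean", "number",
  "bigint", "string", "symbol", "function",
];

// Short (minified-style) and long spellings of the same check.
const isDefinedShort = (typeName) => typeName < "u";
const isDefinedLong = (typeName) => typeName !== "undefined";

// Collect every typeof result on which the two spellings disagree.
const disagreements = TYPEOF_RESULTS.filter(
  (typeName) => isDefinedShort(typeName) !== isDefinedLong(typeName)
);
console.log(disagreements.length); // 0

// `typeof` never throws on an undeclared identifier, so this line is safe
// in Node or a web worker, where `window` does not exist.
const hasWindow = typeof window < "u";
console.log(hasWindow);
```

So the check is a runtime test rather than a constant: its value depends on where the bundle is executed.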


which is the case for that code and it was added by the obfuscator


No obfuscator would add only that. It is almost surely from some library that is aware of the possibility that `window` may not exist.


This refers to the fact that the ChatGPT-generated version is missing some characters that are used in the original example. Namely, [looks like HN does not allow me to paste unicode characters, but I am referring to the block characters] can be seen in their version, but cannot be seen in the ChatGPT-generated version. However, it may well be simply because I didn't include all the necessary context.

Discrediting the entire output because of a few missing characters would be very pedantic.

Otherwise, the output is identical as far as I can tell by looking at it.


It's because the author miscopy-pasted the original code: those "â–‘â–’â–“â–ˆ" at the end of the O5 string are supposed to be the block characters. E.g. "â–’" in Windows-1252 [0] is 0xE2 0x96 0x92, which is exactly the UTF-8 encoding for U+2592 MEDIUM SHADE [1].

[0] https://en.wikipedia.org/wiki/Windows-1252#Character_set

[1] https://www.compart.com/en/unicode/U+2592
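
The mix-up can be reversed by hand. Below is a minimal sketch, assuming the text really is UTF-8 that was decoded as Windows-1252: map each character back to its Windows-1252 byte, then decode the bytes as UTF-8. The function name `repairMojibake` and the lookup table are illustrative, and the table is deliberately partial (only the characters this example needs).

```javascript
// Partial Windows-1252 table for the 0x80-0x9F range, where it differs from
// ISO 8859-1. Only the characters needed here; a full table has more entries.
const CP1252_HIGH = new Map([
  ["\u02C6", 0x88], // ˆ modifier letter circumflex accent
  ["\u2018", 0x91], // ‘ left single quotation mark
  ["\u2019", 0x92], // ’ right single quotation mark
  ["\u201C", 0x93], // “ left double quotation mark
  ["\u2013", 0x96], // – en dash
]);

// Hypothetical helper: undo "UTF-8 bytes shown as Windows-1252 text".
function repairMojibake(text) {
  const bytes = Uint8Array.from(text, (ch) => {
    if (CP1252_HIGH.has(ch)) return CP1252_HIGH.get(ch);
    const code = ch.codePointAt(0);
    if (code <= 0xff) return code; // same value as in ISO 8859-1
    throw new RangeError(`U+${code.toString(16)} is not in the partial table`);
  });
  // `fatal: true` makes invalid UTF-8 throw instead of silently producing U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(repairMojibake("â–‘â–’â–“â–ˆ")); // ░▒▓█
```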


Possible that this is the mistake.

However, I don't think I miscopied the original code.

https://reactive.network/assets/index-8b4ef4ac.js

If you look for `oahkbdpqwmZO0QLCJUYXzcvunxrjft` in the output, you should see that those characters appear exactly like that. Maybe an issue with encoding of the script file?


Most definitely; if I use "View >> Repair Text Encoding" in Firefox, it shows the block characters. But I have to admit, it's strange that Firefox does not choose UTF-8 by default in this case.


Yes, turns out I was the one who made the mistake.

I updated the article to reflect the mistake.

> Update (2024-08-29): Initially, I thought that the LLM didn’t replicate the logic accurately because the output was missing a few characters visible in the original component (e.g., ). However, a user on HN forum pointed out that it was likely a copy-paste error.

>

> Upon further investigation, I discovered that the original code contains different characters than what I pasted into ChatGPT. This appears to be an encoding issue, as I was able to get the correct characters after downloading the script. After updating the code to use the correct characters, the output is now identical to the original component.

>

> I apologize, GPT-4, for mistakenly accusing you of making mistakes.


If no character set is specified, plain text content is assumed to be Windows-1252. This probably extends to application/javascript as well, but I'd have to check to be sure.

The web pre-dates UTF-8, although not by much. Ken Thompson introduced UTF-8 at the winter USENIX conference in 1993 and CERN released the web in April, but it would be several more years before UTF-8 became common. The early web was ISO 8859-1 by default. But people were pretty lazy about specifying character sets back then (still are, actually), and Microsoft started sending or assuming their Windows-1252 character set where ISO 8859-1 was required by the spec. Eventually the spec was changed to match de facto behavior. I guess the assumption was that if you're too stupid or lazy to say what character set you're using, then it's probably Windows-1252. (Today the assumption would be that it's probably UTF-8.) I'm not sure what the specs say today, but I think HTML is assumed to be in UTF-8, and everything else is assumed to be Windows-1252 (if the character set is not explicitly declared).
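
To make the "declared vs. assumed" distinction concrete, here is a rough sketch of picking a character set from a `Content-Type` header. The function name is hypothetical, and the `windows-1252` fallback only mirrors the guess made in this comment; it is not a statement of what any spec or browser does for scripts.

```javascript
// Hypothetical helper: read the charset parameter from a Content-Type value.
// The default fallback is an assumption taken from the comment above.
function charsetFromContentType(contentType, fallback = "windows-1252") {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType ?? "");
  return match ? match[1].toLowerCase() : fallback;
}

console.log(charsetFromContentType("application/javascript; charset=UTF-8")); // utf-8
console.log(charsetFromContentType("application/javascript")); // windows-1252
```

A server that sends `charset=utf-8` explicitly leaves the client nothing to guess, which would have avoided the confusion in this thread.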


He also told it to convert the code from JavaScript to TypeScript.

I would guess that if he had just told it to rename the variables and methods first, it would have been closer to the original.



