Discussion:
Understanding a JS binary file format
Gábor Molnár
2014-02-09 13:50:01 UTC
Hello all,

I'd like to understand a binary file format used to store JavaScript code,
and would appreciate any help. The application I'm looking at uses an
older version of SpiderMonkey as its embedded JS engine, and stores its
initialization JS code in a binary file. The file format itself looks
somewhat similar to the binary JS files found in the StartupCache files of
Thunderbird. It's a serialization format for the SpiderMonkey JavaScript
AST, I guess.

Is this file format documented somewhere? If not, where should I look in
the mozilla source code to find out more about the
serialization/deserialization process? Do you think that SpiderMonkey could
somehow turn this AST-like representation back into JS code?

Thanks,
Gábor
Till Schneidereit
2014-02-09 22:51:33 UTC
Hi Gábor,

I don't know which exact version you're using, so I can't direct you to
specific files, but I'd assume that your application is using XDR. In
current mozilla-central, the XDR-serialization is implemented in
js::XDRScript[1], so looking at that function in your version's source
should give you a good starting point.


cheers,
till


[1]: http://mxr.mozilla.org/mozilla-central/source/js/src/jsscript.cpp#443
_______________________________________________
dev-tech-js-engine mailing list
https://lists.mozilla.org/listinfo/dev-tech-js-engine
Gábor Molnár
2014-02-10 10:06:30 UTC
Thanks for the suggestion! Based on the source code and the magic number at
the beginning of the file (0xdead0007 in little endian encoding), I've
identified it as a SpiderMonkey 1.8 XDR file.
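
The magic-number check described above can be sketched in a few lines of C++. This is only an illustration, not SpiderMonkey code: the names kXdrMagicScript18, ReadMagicLE and LooksLikeSpiderMonkey18Xdr are hypothetical, and the only facts taken from the thread are the value 0xdead0007 and its little-endian encoding at the start of the file.

```cpp
#include <cstddef>
#include <cstdint>

// Magic value reported in the thread for SpiderMonkey 1.8 XDR script files.
// The constant name is made up for this sketch.
constexpr std::uint32_t kXdrMagicScript18 = 0xdead0007u;

// Assembles the first four bytes of `data` into a 32-bit value, treating
// them as little-endian. Returns false if fewer than four bytes are given.
bool ReadMagicLE(const unsigned char* data, std::size_t size,
                 std::uint32_t* magic) {
    if (size < 4) {
        return false;
    }
    *magic = static_cast<std::uint32_t>(data[0]) |
             (static_cast<std::uint32_t>(data[1]) << 8) |
             (static_cast<std::uint32_t>(data[2]) << 16) |
             (static_cast<std::uint32_t>(data[3]) << 24);
    return true;
}

// True when the buffer starts with the magic number quoted in the thread.
bool LooksLikeSpiderMonkey18Xdr(const unsigned char* data, std::size_t size) {
    std::uint32_t magic = 0;
    return ReadMagicLE(data, size, &magic) && magic == kXdrMagicScript18;
}
```

With this encoding, a matching file begins with the bytes 07 00 ad de. Assembling the value byte by byte avoids depending on the byte order of the machine that runs the check.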

If my understanding is correct, an XDR file is not an AST storage format
but rather stores the state of a script at a given time. Assuming this, do
you think it is possible to reconstruct the whole JS file, or at least
extract the JS code of the contained functions?

Gábor
Gábor Molnár
2014-02-10 10:23:48 UTC
Okay, I think I've found what I need: using the js_DecompileScript()
function should do the trick. Thanks for the help again!

Gabor
Till Schneidereit
2014-02-10 21:27:46 UTC
Post by Gábor Molnár
Okay, I think I've found what I need: using the js_DecompileScript()
function should do the trick. Thanks for the help again!

Great. Note that this won't work anymore if you ever update to current
versions of SpiderMonkey: most of the decompiler has been removed. I'm
pretty sure that that'd be the least of your worries during the updating
process, though. ;)

Nicolas B. Pierron
2014-02-10 10:14:25 UTC
Hi Gábor,

Post by Gábor Molnár
I'd like to understand a binary file format used to store JavaScript code,
and would appreciate any help. The application I'm looking at is using some
older version of SpiderMonkey as embedded JS engine, and stores its
initialization JS code in a binary file.

I am kind of surprised: by default the JS engine does not do any
serialization on its own, unless you call JS_EncodeScript /
JS_DecodeScript yourself when you embed SpiderMonkey.

Post by Gábor Molnár
The file format itself looks
somewhat similar to the binary JS files found in StartupCache files of
Thunderbird. It's a serialization format for the SpiderMonkey JavaScript
AST, I guess.

I do not know about the StartupCache files of Thunderbird, but for XUL
scripts (JavaScript embedded in XUL), we do use JS_EncodeScript and
JS_EncodeFunction to save a pre-parsed version.

We do not save the AST, but the bytecode together with the metadata of the
JSScript.
Post by Gábor Molnár
Is this file format documented somewhere? If not, where should I look in
the mozilla source code to find out more about the
serialization/deserialization process? Do you think that SpiderMonkey could
somehow turn this AST-like representation back into JS code?

There is no documentation of this process as it is not really exposed. All
these serializations are handled by XDR* functions. These functions might
be a bit weird, as they contain both the code to serialize and deserialize,
based on the template parameter.
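
To illustrate the pattern described above, here is a minimal, self-contained sketch of a routine that both serializes and deserializes depending on a template parameter. Everything in it is hypothetical and written for this note: XdrMode, XdrState, codeUint32, ScriptHeader and XdrScriptHeader are not SpiderMonkey's actual types or functions, and the two-field header is an invented layout, not the real script format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Direction of the transfer; selected at compile time.
enum XdrMode { XDR_ENCODE, XDR_DECODE };

// Hypothetical stand-in for an XDR state object: a byte buffer plus a
// read cursor. In encode mode values are appended to the buffer, in
// decode mode they are read back from the cursor position.
template <XdrMode mode>
struct XdrState {
    std::vector<std::uint8_t> buffer;
    std::size_t cursor = 0;

    // Writes or reads one little-endian 32-bit value, depending on `mode`.
    bool codeUint32(std::uint32_t* value) {
        if (mode == XDR_ENCODE) {
            for (int shift = 0; shift < 32; shift += 8) {
                buffer.push_back(static_cast<std::uint8_t>(*value >> shift));
            }
            return true;
        }
        if (buffer.size() - cursor < 4) {
            return false;  // not enough bytes left to decode
        }
        std::uint32_t decoded = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            decoded |= static_cast<std::uint32_t>(buffer[cursor++]) << shift;
        }
        *value = decoded;
        return true;
    }
};

// Invented two-field header, used only to show the pattern.
struct ScriptHeader {
    std::uint32_t magic;
    std::uint32_t length;
};

// A single function body describes the field order once; the template
// parameter decides whether it serializes or deserializes.
template <XdrMode mode>
bool XdrScriptHeader(XdrState<mode>* xdr, ScriptHeader* header) {
    return xdr->codeUint32(&header->magic) &&
           xdr->codeUint32(&header->length);
}
```

Calling XdrScriptHeader with an XdrState<XDR_ENCODE> appends both fields to the buffer, and calling it with an XdrState<XDR_DECODE> holding those bytes fills the struct back in. Writing the field order only once keeps the reader and the writer from drifting apart, which is presumably why the real XDR* functions are structured this way.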

We used to have a decompiler, which reconstructed the source from the
source notes and the content of the bytecode. It was removed about 1.5
years ago because it was hard to maintain.
--
Nicolas B. Pierron