Unmangling CodeWarrior C++ names
Richard Buckle - 28 Jan 2005 22:16 GMT
Hi all,
Can anyone point me to code for unmangling CodeWarrior's C++ function names or, failing that, an up to date specification of the mangling? I googled the web and the newsgroups but didn't find anything that looked remotely up to date.
I'm attempting a Python script that can apply a link map to a crash log, so as to recover the function names and offsets into the functions, for developers who turn off traceback tables in their release builds but archive their link maps. If successful, I'll put it on sourceforge under a BSD-ish licence. I thought it would be nice to unmangle the C++ function names into the bargain.
Any previously invented wheels gratefully received.
Richard.
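The lookup described above (crash-log address to function name plus offset into the function) can be sketched as a binary search over sorted function start addresses. This is only an illustration under stated assumptions: the thread does not show the link map format, so the sketch assumes the map has already been parsed into (start_address, name) pairs, and `build_index`, `resolve` and `load_offset` are hypothetical names, not part of Richard's script.

```python
import bisect

def build_index(symbols):
    """Turn (start_address, name) pairs into two parallel, address-sorted lists."""
    ordered = sorted(symbols)
    starts = [addr for addr, _ in ordered]
    names = [name for _, name in ordered]
    return starts, names

def resolve(address, starts, names, load_offset=0):
    """Map a crash-log address to (name, offset_into_function), or None.

    load_offset is a hypothetical parameter: the amount that was added to
    link-map addresses to produce the addresses seen in the crash log.
    """
    link_addr = address - load_offset
    # Index of the last function whose start address is <= link_addr.
    i = bisect.bisect_right(starts, link_addr) - 1
    if i < 0:
        return None  # address lies below the first known function
    return names[i], link_addr - starts[i]
```

Without function sizes the last function matches every higher address, so a real tool would probably also want an upper bound from the link map.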
Eric VERGNAUD - 28 Jan 2005 22:54 GMT
In article 280120052216455583%richardb@sailmaker.co.uk, Richard Buckle (richardb@sailmaker.co.uk) wrote on 28/01/05 23:16:
> Hi all,
> [quoted text clipped - 13 lines]
>
> Richard.
There is a web site for that, but I don't remember the URL. CW9 follows the common mangling rules described there.
Eric
Richard Buckle - 29 Jan 2005 04:01 GMT
> Richard Buckle (richardb@sailmaker.co.uk) wrote on 28/01/05 23:16:
> [quoted text clipped - 7 lines]
>
> Eric
Hi Eric,
Unfortunately, without the URL in question your answer doesn't get me anywhere. I have already googled on all the search strings I could dream up [Gallic shrug].
Perhaps you are thinking of the "standard" mangling spec that is set out in Stroustrup "The C++ Programming Language"? I have the "special edition" of that book, and I'm sure that a mangling spec is given in there somewhere (I remember seeing it), but I'm damned if I can find it using such blunt instruments as its table of contents and its index.
Oh well, I'm off to play games now, then sleep. The rest of the Python script is coming along nicely.
Regards, Richard.
Miro Jurisic - 29 Jan 2005 05:08 GMT
> Can anyone point me to code for unmangling CodeWarrior's C++ function
> names or, failing that, an up to date specification of the mangling? I
> googled the web and the newsgroups but didn't find anything that looked
> remotely up to date.
<http://www.codesourcery.com/cxx-abi/>
> I'm attempting a Python script that can apply a link map to a crash
> log, so as to recover the function names and offsets into the
> functions, for developers who turn off traceback tables in their
> release builds but archive their link maps. If successful, I'll put it
> on sourceforge under a BSD-ish licence. I thought it would be nice to
> unmangle the C++ function names into the bargain.
The CW mangling scheme is largely supported by c++filt (man c++filt), so if you are willing to accept its problems (specifically, that it doesn't understand some complex template functions, and leaves them mangled), you may be able to get away with zero work on unmangling.
c++filt is used by crashreporter, too, so you will not be doing any worse than it does.
meeroh
 Signature If this message helped you, consider buying an item from my wish list: <http://web.meeroh.org/wishlist>
Richard Buckle - 29 Jan 2005 06:10 GMT
> > Can anyone point me to code for unmangling CodeWarrior's C++ function
> > names or, failing that, an up to date specification of the mangling? I
> [quoted text clipped - 14 lines]
> c++filt is used by crashreporter, too, so you will not be doing any worse than
> it does.
Many thanks, Miro! As always, you're a gentleman and a scholar.
Good enough for crashreporter is good enough for my humble Python script. Once I get it reasonably stable, I'll pop it up on sourceforge so that people can enhance it as needed. With a fair wind, I might even be able to wrap a basic Cocoa GUI around it.
Richard.
PS I append my proposed CLI for general criticism:
~~~~~~~~~~~~~~~~
decodecrashlog [-usage | -help | -h | -?]
decodecrashlog -m linkmap_file -o offset_num [-v verbosity] crash_log_file
-usage prints the usage and exits, ignoring all other switches and args before and after it. If invalid switches and/or arguments are supplied, or none at all, then -usage is assumed.
-h is a synonym for -usage
-help is a synonym for -usage
-? is a synonym for -usage
-m specifies the link map
-o is a number specifying the offset that must be added to the addresses in linkmap_file to arrive at offsets in crash_log_file. It follows the usual conventions for C literals, i.e. "123" is decimal, "0123" is octal and "0x0123" is hexadecimal. Future versions may be able to deduce this, especially if the author had the foresight to force a traceback entry for main() using "#pragma traceback on".
-v optional, specifies verbosity of output to stderr, currently ignored, to be defined.
crash_log_file is the file that should be enhanced. The enhanced copy is emitted to stdout. If crash_log_file is not supplied, it is taken from stdin.
decodecrashlog returns 0 on success and non-zero on failure. [failure returns to be defined.]
~~~~~~~~~~~~~~~~
Miro Jurisic - 29 Jan 2005 07:20 GMT
> PS I append my proposed CLI for general criticism:
>
> ~~~~~~~~~~~~~~~~
> decodecrashlog [-usage | -help | -h | -?]
> decodecrashlog -m linkmap_file -o offset_num [-v verbosity]
> crash_log_file
Multi-character option names should be prefixed with --, not -. See the getopt_long manpage for more information. In short, most command line utilities (notable exceptions, sadly, often come from Apple) have standardized on this form, and you'd do well to follow it yourself.
Also, -h and --help are the usual help switches; --usage would probably just be redundant.
This looks like a very useful tool, btw. Let us know when you post it on sf.net :-)
meeroh
Richard Buckle - 30 Jan 2005 03:35 GMT
> > PS I append my proposed CLI for general criticism:
> >
> [quoted text clipped - 12 lines]
> just be
> redundant.
Points taken. Actually, Python's "optparse" module had led me to similar conclusions. Also, it's persuaded me that required arguments should not be switches.
I'm now contemplating:
~~~~~~~~~~~~~~~~~~~~~
decodecrashlog [--help | -h]
decodecrashlog [-v verbosity] [-o offset_num] linkmap_file crash_log_file
-h prints the usage and exits, ignoring all other switches and args before and after it. If invalid switches and/or arguments are supplied, or none at all, then -h is assumed.
--help is a synonym for -h.
-v specifies verbosity of output to stderr, one of DEBUG, INFO, WARNING, ERROR, CRITICAL. Default is CRITICAL.
-o is a number specifying the offset that must be added to the addresses in linkmap_file to arrive at offsets in crash_log_file. It follows the usual conventions for C literals, i.e. "123" is decimal, "0123" is octal and "0x0123" is hexadecimal. If not supplied, the program will try to deduce it, assuming that the author had the foresight to force a traceback entry for main() using "#pragma traceback on". If the program cannot deduce it, it will print a message to stderr and simply echo crash_log_file verbatim to stdout.
linkmap_file specifies the link map.
crash_log_file is the file that should be enhanced. The enhanced copy is emitted to stdout. If crash_log_file is not supplied, it is taken from stdin.
decodecrashlog returns 0 on success and arbitrary non-zero values on failure, in which case you should examine the output to stderr to discern the cause of the failure.
~~~~~~~~~~~~~~~~~~~~~
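The revised CLI above can be sketched with the standard library. This is a minimal sketch under stated assumptions, not Richard's code: it uses argparse rather than the optparse module he mentions, `parse_c_int` and `build_parser` are hypothetical names, and crash_log_file is treated as optional because the description says it falls back to stdin. Python's own `int(text, 0)` rejects a literal such as "0123", so the C-style octal rule is handled by hand.

```python
import argparse

LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]

def parse_c_int(text):
    """Parse an unsigned integer using C literal rules: 123, 0123 (octal), 0x0123 (hex)."""
    s = text.strip().lower()
    if s.startswith("0x"):
        return int(s[2:], 16)
    if len(s) > 1 and s.startswith("0"):
        return int(s[1:], 8)
    return int(s, 10)

def build_parser():
    """Build a parser for the proposed decodecrashlog command line."""
    # argparse supplies -h/--help itself; on invalid arguments it prints
    # the usage plus an error to stderr and exits with a non-zero status.
    parser = argparse.ArgumentParser(prog="decodecrashlog")
    parser.add_argument("-v", dest="verbosity", choices=LEVELS, default="CRITICAL",
                        help="verbosity of output to stderr")
    parser.add_argument("-o", dest="offset", type=parse_c_int, default=None,
                        help="offset added to link map addresses (C literal syntax)")
    parser.add_argument("linkmap_file", help="the link map")
    parser.add_argument("crash_log_file", nargs="?", default=None,
                        help="crash log to enhance (stdin if omitted)")
    return parser
```

A value of None for the offset leaves room for the "try to deduce it" behaviour described above.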
> This looks like a very useful tool, btw. Let us know when you post it on sf.net
> :-)
Will do. It's the usual case of "I wrote it because I need it".
Richard.
Miro Jurisic - 30 Jan 2005 04:51 GMT
> > This looks like a very useful tool, btw. Let us know when you post it on
> > sf.net
> > :-)
>
> Will do. It's the usual case of "I wrote it because I need it".
For me it will be a case of "I managed to do without it long enough that someone else wrote it before I had to" :-)
meeroh
Richard Buckle - 30 Jan 2005 04:17 GMT
Some interesting results with c++filt and Python.
1. Quote your mangled names
--------------------------------
You need to quote mangled names because otherwise the shell will try to interpret characters such as '<' and '>':

% c++filt __nw__20CObjectStore<60,300>FUl
tcsh: 60,300: No such file or directory.
% c++filt "__nw__20CObjectStore<60,300>FUl"
CObjectStore<60,300>::operator new(unsigned long)
--------------------------------
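From Python, the quoting problem can be avoided altogether by not going through a shell. A minimal sketch, assuming c++filt is on the PATH; `demangle_one` and its `tool` parameter are hypothetical names chosen for illustration:

```python
import subprocess

def demangle_one(mangled, tool=("c++filt",)):
    """Demangle a single name by running the tool with the name as one argument."""
    # The command is given as a list and shell=False is the default, so no
    # shell ever sees the '<' and '>' characters and no quoting is needed.
    result = subprocess.run([*tool, mangled],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Calling `demangle_one("__nw__20CObjectStore<60,300>FUl")` should then behave like the quoted command in the transcript above. Spawning one process per name is still slow, as the next point explains.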
2. Aggregate your calls to c++filt
--------------------------------
I'm testing against the link map for Parlance -- a medium-sized app having just under 8000 functions.
Having Python spawn a separate c++filt process for each function name utterly crucifies performance. For an *enormous* performance win, concatenate a whole load of mangled names with "\n", pass the aggregated string to c++filt, and then parse the lines you get back.
*BUT*, if you try to send too long a string, the buffers will silently drop data (I don't know whether it's Python's buffers, the shell's or the OS's), so you *must* test that you got back as many names as you sent.
Empirically, sending 4096 names consistently succeeds, whereas sending 8192 names consistently fails. The governing factor is, I suppose, the total length of the command string. For robustness, I've given my script an adaptive algorithm that tries descending powers of two, starting with 4096. You'll see the code when I put it up on sourceforge.
--------------------------------
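The adaptive batching described above might look roughly like this. It is a sketch of the idea as described, not the actual script: `filter_batch` and `demangle_all` are hypothetical names, the names are passed as separate arguments rather than as one "\n"-joined string, and a name that fails even on its own is simply left mangled.

```python
import subprocess

def filter_batch(names, cmd=("c++filt",)):
    """Run one filter process with all names as arguments; return its output lines."""
    result = subprocess.run([*cmd, *names], capture_output=True, text=True)
    return result.stdout.splitlines()

def demangle_all(names, run=filter_batch, start_size=4096):
    """Demangle names in batches, halving the batch size whenever a batch fails."""
    names = list(names)
    out = []
    size = start_size
    pos = 0
    while pos < len(names):
        batch = names[pos:pos + size]
        try:
            lines = run(batch)
        except OSError:
            # Raised, for example, when the argument list is too long.
            lines = []
        if len(lines) == len(batch):
            # Got back as many names as were sent: accept the batch.
            out.extend(lines)
            pos += len(batch)
        elif size > 1:
            size //= 2  # descend through powers of two and retry
        else:
            out.append(batch[0])  # give up on this name, keep it mangled
            pos += 1
    return out
```

If the limit really is the length of the command line, writing the names to the tool's standard input instead would avoid it, since c++filt reads names from stdin when given no arguments; the count check is still a cheap safeguard either way.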
HTH some future googler, Richard.
Thorsten Froehlich - 29 Jan 2005 13:53 GMT
> Hi all,
>
> Can anyone point me to code for unmangling CodeWarrior's C++ function
> names or, failing that, an up to date specification of the mangling? I
> googled the web and the newsgroups but didn't find anything that looked
> remotely up to date.
Assuming you are talking about Mach-O, the mangling is gcc-compatible and follows the common gcc ABI specification. You can find it at <http://www.codesourcery.com/cxx-abi/abi.html>.
Thorsten
Richard Buckle - 30 Jan 2005 04:19 GMT
> > Can anyone point me to code for unmangling CodeWarrior's C++ function
> > names or, failing that, an up to date specification of the mangling?
> [quoted text clipped - 6 lines]
> follows the common gcc ABI specification. You can find it at
> <http://www.codesourcery.com/cxx-abi/abi.html>.
Thanks, Thorsten, that's a useful link to have. However, I'm going with Miro's suggestion to call c++filt.
Regards, Richard.