IMSLP:Programmer Portal

Introduction

Currently this page is for IMSLP programmers to discuss various issues randomly. In the future it is probably a good idea to make this into a coherent portal like the other ones, and maybe separate documentation from FAQs. But that's in the future.

Don't post huge chunks of code here so as not to clutter up the page, but feel free to post a line or two if necessary.

General Observations

Setting up a local site

The following is a rough sketch of the steps through which Leonard Vertighel managed to set up a (mostly) working site. There may well be better solutions.

  1. Prerequisites: lighttpd with PHP5 as a CGI module, MySQL, php5-xcache.
  2. Install MW 1.15 using the standard install procedure.
  3. cp -u -r /code/from/svn
  4. On line 4 of index.php, change the absolute path to relative (unless it happens to match your setup).
  5. Append require_once( "{$IP}/extensions/IMSLPSettings.php" ); in file LocalSettings.php.
  6. In file extensions/IMSLPSettings.php, remove/comment the variables $wgArticlePath and $wgUsePathInfo.
  7. If you run into problems with the captcha mechanism, you can disable it by setting both $wgAutoConfirmAge and $wgAutoConfirmCount to 0 in extensions/IMSLPSettings.php.

Committing

  • Use the standard log format (prefix each line with * BUG: (bug), * FEAT: (feature) or * MISC: (for everything else)).
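
As an illustration of the log format above, here is a minimal PHP sketch that checks a commit message against the three prefixes. The function name isStandardLogMessage() is hypothetical and not part of the IMSLP code, and the assumption that each prefix is followed by a space and free text is my reading of the rule, not a documented requirement.

```php
<?php
// Hypothetical helper, not part of the IMSLP code base: returns true only if
// every line of the commit message starts with "* BUG: ", "* FEAT: " or
// "* MISC: " followed by some text.
function isStandardLogMessage( $message ) {
	$lines = preg_split( '/\r?\n/', trim( $message ) );
	foreach ( $lines as $line ) {
		if ( !preg_match( '/^\* (BUG|FEAT|MISC): \S/', $line ) ) {
			return false;
		}
	}
	return true;
}

$good = "* BUG: Fix broken internationalization of the Add Work form\n"
	. "* MISC: Remove unused wrapper function";
$bad = "Fixed some stuff";

var_dump( isStandardLogMessage( $good ) ); // bool(true)
var_dump( isStandardLogMessage( $bad ) );  // bool(false)
```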

Questions for Feldmahler

Translation System for Special Pages and FTE templates

Leonard asked me about the structure of the code with regards to Special Page text translations, and here is my response:

Before I start, I have to say that I'm saying all of this from memory, so it may be inaccurate. However, it should help to orient you hopefully. Unfortunately I do not have time currently to actually investigate what goes on in the code, so all I can provide is what I remember goes on in the code, with the possibility that either I am inaccurate or that I've changed the code myself later on and forgot I did so.
All IMSLP translations go through IMSLPGetTranslations() or fmwGetTranslations() (the latter is a wrapper for the former I believe, or the other way around). The function itself is in one of the general "bag of stuff" PHP files I believe (there shouldn't be many of those anyhow). All IMSLP translation issues (including not only Special Page translations but also FTE translations) arise from that function, so the culprit is always that function (which isn't all that long).
What IMSLPGetTranslations() does is to recursively get the base translation (i.e. English page without /<lang>) and then merge the actual translation page (if it exists) into the base translation, such that untranslated words will appear as the base translation (which means we don't get message nonexistence issues). Its interaction with the MediaWiki message system should be limited to using wfMsg() or wfMsgForContent() (one of them automatically selects the translation and the other only returns the default page I believe, you can find all of the wfMsg* functions in GlobalFunctions.php). I may be wrong about its usage of wfMsg() (or the wrapper equivalent in MWAPI).
One special thing only for Special Pages: you see a hardcoded list of defaults, which exists to prevent uninitialized usage of the message array (i.e. so we make sure there exists a default value). The other reason was that I was too lazy to move the translations to the wiki... Otherwise, IMSLPGetTranslations() should not treat SpecialPages any differently from FTE templates.
P.S. Leonard: Feel free to add the process you did to set up an offline duplicate of IMSLP in the "General Observations" section above, if you have time that is of course. --Feldmahler 16:04, 29 October 2009 (UTC)
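
The fallback behaviour described above for IMSLPGetTranslations() (merge the translation into the English base so that untranslated messages show the base text) can be sketched roughly as follows. This is not the actual IMSLP code: the function name mergeTranslations(), the key => text array layout, and the sample messages are all assumptions, and the recursion and wfMsg() lookups mentioned above are left out.

```php
<?php
// Illustrative sketch only; the real IMSLPGetTranslations() may differ.
// Messages are modelled as key => text arrays (an assumption).
function mergeTranslations( array $base, array $translated ) {
	// Array union keeps the left-hand value for keys present in both arrays,
	// so translated texts win and missing keys fall back to the base text.
	return $translated + $base;
}

// Hypothetical sample messages.
$english = array( 'title' => 'Add File', 'submit' => 'Upload', 'cancel' => 'Cancel' );
$german  = array( 'title' => 'Datei hinzufügen', 'submit' => 'Hochladen' );

$messages = mergeTranslations( $english, $german );
echo $messages['title'], "\n";  // Datei hinzufügen
echo $messages['cancel'], "\n"; // Cancel (untranslated, falls back to English)
```

Because every key of the base array survives the merge, code that reads a message never hits an undefined index, which is the same guarantee the hardcoded defaults for Special Pages are meant to give.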
I'm still not sure which would be the preferred way to internationalize those messages of the file upload form which are not yet translatable. These messages are hardcoded in various arrays (for example, the values in the File Type selection box, as well as several others). How bad would the performance impact be if IMSLPGetTranslations() was called for each individual array (half a dozen or so)? Should I try to rewrite the code so that a single call to IMSLPGetTranslations() is sufficient (this seems to require more extensive changes)? Or do you have an altogether different solution?
Oh sorry! I thought your question was only about the bug and forgot about your other questions. A note before I address the other questions: always keep caches in consideration. What you are seeing may well not be a "bug", but rather a result of XCache. Clear the XCache variable cache frequently during testing to make sure that is not the problem (I've been bitten by this many many times unfortunately). Or even better, disable the XCache variable cache (set to 0M) on your test setup so that nothing is cached anyway.
Absolutely do not consider performance when fixing SpecialPage code. Because of the extreme rarity with which they are called, no amount of normal PHP code (except, of course, an endless loop or something similar) could slow down the server to any noticeable degree. But this means that since we do not worry about performance, we should worry about good programming design instead. (Note also that the performance non-issue does not extend to functions shared with other intensively used IMSLP features. IMSLPGetTranslations is a good example of such a function.)
Regarding the non-translation issue, I would keep all special page translations on a single message page, so that we don't have to search everywhere for stuff (an exception is obviously made for substantial shared messages, which I don't remember to exist right now but may in the future). Remember that the message page can be annotated; all lines not starting with an asterisk (*) will be ignored. I would rather that you separate different translation sections by annotating the page rather than having multiple message pages which would add to the confusion.
However, in order to implement the dropdown boxes in such a case, you will likely need a "translation array", which specifically pulls out elements from the global translation and sticks them into this array that is then used to create the dropdown box. A future consideration may be to wikify the dropdown boxes using separate translation pages (such that non-programmers can change the content of the boxes), but I strongly oppose doing this at the moment, as we do not want to generate confusion especially with the pressure to redo the categorization system (and so we don't have admins caving in and making the system even more messy). --Feldmahler 02:50, 30 October 2009 (UTC)
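
A rough sketch of the two ideas in the comment above: an annotated message page where only lines starting with an asterisk count, and a "translation array" that pulls the dropdown entries out of the full message set. The "* key = text" line format, the message keys, and the function names parseMessagePage() and buildDropdownOptions() are all hypothetical; the actual IMSLP message page format is not described here and may well differ.

```php
<?php
// Hypothetical page format: "* key = text". Only the rule that lines not
// starting with an asterisk are ignored comes from the discussion above.
function parseMessagePage( $text ) {
	$messages = array();
	foreach ( preg_split( '/\r?\n/', $text ) as $line ) {
		// Annotation lines (no leading asterisk) are skipped.
		if ( substr( $line, 0, 1 ) !== '*' ) {
			continue;
		}
		$parts = explode( '=', substr( $line, 1 ), 2 );
		if ( count( $parts ) === 2 ) {
			$messages[ trim( $parts[0] ) ] = trim( $parts[1] );
		}
	}
	return $messages;
}

// "Translation array": copy only the requested keys out of the full set,
// in the order in which the dropdown should list them.
function buildDropdownOptions( array $messages, array $keys ) {
	$options = array();
	foreach ( $keys as $key ) {
		if ( isset( $messages[$key] ) ) {
			$options[$key] = $messages[$key];
		}
	}
	return $options;
}

// Hypothetical message page with two annotated sections.
$page = "File type selection box\n"
	. "* filetype-score = Full Score\n"
	. "* filetype-parts = Parts\n"
	. "Buttons\n"
	. "* submit = Upload\n";

$options = buildDropdownOptions(
	parseMessagePage( $page ),
	array( 'filetype-score', 'filetype-parts' )
);
print_r( $options );
```

Keeping the dropdown keys in a small array in the code means the texts stay on the single message page, while the set of entries and their order remain under programmer control.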
While I'm asking: I think it would be much better if the explanations for the upload form, rather than being in a separate column, appeared directly above each individual input field, with a JS show/hide toggle. Is this feasible in such a way that the text (both English as well as the translations) remains wiki-editable? --Leonard Vertighel 17:19, 29 October 2009 (UTC)
One thing I will say is that I know next to nothing about JS itself, and one of IMSLP's design goals was to make sure that people without JS can use IMSLP just fine (like me lol). However, considering the commonness of JavaScript, it may be acceptable to design in a way that does not preclude using the site without JavaScript, but does add additional features via JavaScript. Graceful fallback is always recommended however (for example, in this case perhaps the explanations could show automatically in the absence of JS).
Regarding this specific case, I would encourage creative thinking (well, actually this is encouraged in all situations). For example, it is very feasible to simply move all the descriptions onto the message page, and that way they are editable by all admins. That way we can have this feature with minimal additional code (and without bad program design to boot). Formatting may be lost, however, unless there is a standard code-mandated format (for example, a description blurb + an example, which seems to me to be a perfectly acceptable format). --Feldmahler 02:50, 30 October 2009 (UTC)
The idea is of course to have all the text visible by default, and then hide it via JS, so that advanced users who already know the instructions by heart can have a much more compact form which requires no scrolling. I completely agree that all parts of the site must be usable without JS.
The reason why I didn't immediately suggest to put the instructions in a MediaWiki message is because I remember you saying that due to performance reasons, there should be no English versions of the messages. But since in this case performance is not an issue, it clearly seems like the best solution. --Leonard Vertighel 07:59, 30 October 2009 (UTC)
I don't remember saying that, but in this case it is not a concern either way. For FTE templates, the English versions are in the wiki, even though FTE is probably the most performance-intensive you will get for translations (however they are cached, so that lessens the issue). Regardless of what I said before, having English defaults on the wiki is not a significant performance hit, because the function was designed with this in mind.
Oh, and by the way, wouldn't it make much more sense if the thumbnail and preview files could be uploaded directly from the Add Files form? (It was even suggested on the forums to add functionality to generate them directly on the server, but that would be quite a bit more involved.) --Leonard Vertighel 13:04, 30 October 2009 (UTC)
The only reason to the contrary I can think of is that a fairly significant amount of work was actually put in to make the file upload thingy work correctly without JS in the first place (for example, we do not want people to reupload the file if they forgot to fill out another form field), and so I'm not sure extending this functionality to previews and thumbnails will be completely easy and painless. In addition, I think this is sufficiently rare that I don't mind the contrary. This one is completely your call however.
Just so I understand: why would there be any difference at all between uploading a PDF file and a PNG file at a time on the one hand, and uploading 20 PDF files at a time (multi-file upload form) on the other hand? It appears to me that the problems in both cases should be very much the same. Basically, we would be turning the single-file upload form into a 3-file upload form, which is still less than the existing 20-file upload form. --Leonard Vertighel 07:09, 31 October 2009 (UTC)
There are a few PDF-specific hardcoded (due to my laziness) lines (the most notable probably is the PDF extension restriction). The reason I suggest you hold off on this for the time being is because this feature is quite global, and you need a better grasp of the code. For example, if I were to implement this, I'd look through the entire UploadAPIWrapper class (or whatever it is called in MWAPI.php) to make sure that all the code is compatible with non-PDF uploads (it was written only with PDF in mind). And because UploadAPIWrapper extends UploadForm, you will need to check out the MediaWiki UploadForm too. This messiness and dependency is why I suggest that you leave this for later when you have a better feel for the code.
However, this is only a suggestion. If you think you can handle researching UploadAPIWrapper and UploadForm, and know the intricacies of error handling in SpecialAddFile, go ahead. However, I do strongly suggest that you do more localized coding first (i.e. the UI/JS improvements I mentioned before), so that I can correct any stylistic issues you have, if nothing else. --Feldmahler 15:51, 31 October 2009 (UTC)
Automatically generated (in addition to automatic page count etc) thumbnails/previews do seem on the face of it to be a good idea, but the reason I haven't been wholly enthusiastic is because this is a feature that requires huge amounts of testing. Processing any image file is very error-prone, and especially if it is as complex a file format as PDF. A test would be to take something like 10,000 files and run them through the code to make sure there aren't any crashes or malformed images. All in all, this is a project with a huge possibility of something going terribly wrong (think the new Mozart complete edition website, which crashes regularly because someone thought it'd be a cool idea to make the server create PDF files on the fly, which is really a horrible idea).
It's not something I'm going to tackle soon; however, "printing" a rasterized image of a given PDF file shouldn't be too error-prone (in theory). Extracting images would clearly be much more problematic, but that would work only for scans anyway. But since we are only aiming for screen resolution anyway, resampling shouldn't be a problem. --Leonard Vertighel 07:09, 31 October 2009 (UTC)
You vastly underestimate the number of things that can go wrong, my friend... ;-) Creating PDF files from sterilized JPGs would seem easier, but the NMA site crashes anyway. I'm fairly sure rasterization is more error-prone than extracting images. --Feldmahler 15:51, 31 October 2009 (UTC)
My suggestion is that you hold off on "fancy" improvements for at least the first few weeks (or months) so that you can get a feel for the code and the submission process first. I would encourage you to prioritize relatively local changes (like the translation bug and JS field descriptions), and leave the harder and more global changes for later (like the thumbnail upload, and more so, the automatic generation stuff). I would especially prioritize user interface changes, because that is an area I simply cannot do anything about, and most of those changes are local, and thus less prone to error especially since you aren't familiar with the code yet.
I just submitted a bugfix for the broken internationalization of the Add Work form. --Leonard Vertighel 15:31, 30 October 2009 (UTC)
Congrats on your first commit! :-) A few words of advice for the future:
  1. Try to use the standard log format (prefix each line with * BUG: (bug), * FEAT: (feature) or * MISC: (for everything else)).
  2. Try to aggregate at least a few lines of changes per commit, so that we don't have too many commits (for example, perhaps you can fix the translation problems with the other SpecialPages too, if they have the bug of course).
  3. Watch out for empty wrapper functions. For example, one of fmwGetTranslations and IMSLPGetTranslations is a wrapper function for the other. If you find (using grep) that nothing in the code uses the wrapper function, remove the function entirely so that it does not clutter the code or create confusion in the future. I am not saying this is applicable here, but just for the future.
  1. Now you should have put this under General Observations above ;) I'll clean up this page a little later on.
  2. It's an isolated bug (the other two "Add Something" forms are OK), so it appeared reasonable to commit it separately, since I don't know when I'll finish the next modification. I'll hold back in the future.
  3. I'll check. --Leonard Vertighel 07:09, 31 October 2009 (UTC)
Use discretion when holding back. My words were just suggestions. --Feldmahler 15:51, 31 October 2009 (UTC)
I do not generally commit single line edits to the server, but rather wait for a substantial amount of changes unless too much time passes without such. Don't worry about when I will commit to the server; keep focusing on the local copy you have yourself. I will ask for your consent before the first commit to server.
Also a general note that you will most likely find grep of immense value to you at least at first, for finding functions. For example, you can do grep -iR 'function IMSLPGetTranslations' * to find where that function is defined. --Feldmahler 23:43, 30 October 2009 (UTC)
P.S. For future reference, please ask questions about different aspects of the code in different sections on this page so that we don't have discussion about everything tangled into one long section. Don't worry about this one. --Feldmahler 23:43, 30 October 2009 (UTC)
Thanks for the hint, actually I've already been using grep to figure out the code. As for the discussion, I'll try my best, but in these things I'm somewhat chaotic by nature... --Leonard Vertighel 07:09, 31 October 2009 (UTC)

Tips for Debugging

Just an observation I make sua sponte: Feel free to use the logWrite() function (or its equivalent in the circumstances like cspErrorExit(), though you can use logWrite() anywhere) to spit out variable values into the log file defined in LocalSettings.php. Remember, print_r( $variable, TRUE ) allows you to spit out a formatted version of contents for arrays and even objects.

Of course you can use a fancier debugging system with stepping and breaks and everything, but in most circumstances using logWrite() is enough.

Also remember, however, to clearly mark all debug entries such that you will remove them before committing to SVN. --Feldmahler 16:04, 29 October 2009 (UTC)
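
A minimal sketch of the logging approach described above. The real logWrite() and its signature are not shown on this page, so the example uses a stand-in function debugLog() and a placeholder log path; both are assumptions, as is the DEBUG-REMOVE marker, which is just one possible way to make debug entries easy to find before committing.

```php
<?php
// Stand-in for IMSLP's logWrite(), whose real signature is not shown here.
// Appends one entry to the given log file.
function debugLog( $message, $logFile ) {
	file_put_contents( $logFile, $message . "\n", FILE_APPEND );
}

// Placeholder path; on a real setup the log file is the one defined in
// LocalSettings.php.
$logFile = sys_get_temp_dir() . '/imslp-debug-example.log';

// Hypothetical data to inspect.
$formData = array( 'title' => 'Example Work', 'files' => array( 'a.pdf', 'b.pdf' ) );

// DEBUG-REMOVE: passing true as the second argument makes print_r() return
// the formatted dump as a string instead of printing it.
debugLog( 'DEBUG-REMOVE formData = ' . print_r( $formData, true ), $logFile );
```

With a fixed marker such as DEBUG-REMOVE, running grep -R 'DEBUG-REMOVE' * before a commit lists every leftover debug line.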