Mailing List Archive
Re: [tlug] Editing XML
- Date: Thu, 19 Jun 2014 22:57:41 +0900
- From: Brian Chandler <brian@example.com>
- Subject: Re: [tlug] Editing XML
- References: <53A1E0CF.7050004@imaginatorium.org> <53A22D21.2060707@extellisys.com>
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 2014-06-19 09:21, Travis Cardwell wrote:
> On 2014年06月19日 03:56, Brian Chandler wrote:
>> So the question is: What is the right tool? InDesign files (haven't
>> actually managed to see one yet) appear to be xml,
>
> InDesign files (INDD) are not XML, but the document can be exported to
> IDML (InDesign Markup Language [1]) or XLIFF (XML Localisation
> Interchange File Format [2]), which are XML. Memsource apparently
> supports IDML.

OK -- basically the havoc has already been wreaked by the time the XLIFF file has been made, with its inappropriate segmentation... But looking at this:

http://wiki.memsource.com/wiki/MemSource_Cloud_User_Manual#Segmentation

The only customisation mentioned is all about stopping abbreviations like approx. from splitting this sentence into two. Looks like a desperate hack to me. OTOH, there must be customization to break on Japanese 'maru'... so this is something else to ask about.

>> The question is whether there is some other generic framework for
>> cracking the text out of (specifically .idml) xml files for
>> translation, in an intelligent and flexible way, capable of helping
>> automation, rather than hindering it. For example, one global
>> replace, something like (imagined example):
>>
>> s/<char-special type='maru-suuji' value=$N>/($N)/
>>
>> ... would replace every circled number by the appropriate (n),
>> supposing that this is the design decision. To do this in Memsource
>> effectively means that every single numeral will be retyped, errors
>> will occur, etc etc. COST.
>
> sed! :)

Well, not exactly sed, because the need is for something xml-aware, which for example will replace 黒 by "black", but only inside <content> tags.

> Translate Toolkit [3] is a set of utilities written in Python and easy
> to hack (that has saved me considerable effort in software translation
> projects). Though it does not support IDML, it supports XLIFF.

Right; XLIFF is post-havoc. The fundamental problem is that the L24n industry has not yet noticed you need to localise format as well as text, and this could be done systematically, just like the text.

Thanks! (for the other responses too)

Brian Chandler
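
For illustration only, here is a minimal sketch of the kind of xml-aware replacement described above, using Python's standard xml.etree.ElementTree. It is not a tool anyone in the thread proposed. The <char-special> element is the imagined example from the message, not real IDML markup, and the sample document, the lowercase <content> tag name, and both function names are hypothetical. The sketch also assumes <content> elements are not nested inside each other.

```python
import xml.etree.ElementTree as ET


def replace_in_content(root, old, new, tag="content"):
    """Replace `old` with `new`, but only in text inside <tag> elements."""
    for content in root.iter(tag):
        for node in content.iter():
            if node.text:
                node.text = node.text.replace(old, new)
            # A descendant's tail is still inside <content>; the tail of
            # <content> itself lies outside it, so it is left untouched.
            if node is not content and node.tail:
                node.tail = node.tail.replace(old, new)


def expand_maru_suuji(root, tag="char-special", kind="maru-suuji"):
    """Turn each (imagined) circled-number element into plain "(N)" text."""
    for parent in list(root.iter()):
        prev = None  # last child of `parent` that was kept
        for child in list(parent):
            if child.tag == tag and child.get("type") == kind:
                text = "(" + child.get("value", "") + ")" + (child.tail or "")
                if prev is None:
                    parent.text = (parent.text or "") + text
                else:
                    prev.tail = (prev.tail or "") + text
                parent.remove(child)
            else:
                prev = child


# Hypothetical sample: 黒 appears both inside and outside <content>.
SAMPLE = (
    "<story>"
    "<title>黒</title>"
    "<content>黒 ink, item "
    "<char-special type='maru-suuji' value='2'/> only</content>"
    "</story>"
)

root = ET.fromstring(SAMPLE)
replace_in_content(root, "黒", "black")
expand_maru_suuji(root)
print(ET.tostring(root, encoding="unicode"))
```

In this sample the 黒 in <title> is left alone while the one in <content> becomes "black", and the circled-number element becomes "(2)" without anyone retyping the numeral. A real .idml story file would need its actual element and attribute names substituted.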
- Follow-Ups:
- Re: [tlug] Editing XML
- From: Jean-Christophe Helary
- References:
- [tlug] Editing XML
- From: Brian Chandler
- Re: [tlug] Editing XML
- From: Travis Cardwell