Talk:ᐊᒥᖅ/amiq
What the hell just happened? Where is the main page? Where is the main page discussion?
- It is at Talk:ᐊᒥᖅ. You see, that is the problem with using "/" as a separator. --Johannes Rohr 11:31, 23 February 2007 (UTC)
sex category image
Can we clear this up once and for all? Is the current image in accordance with Wikipedia policy?
page name
I removed the slash; maybe this will work better, given the slash problem. Or maybe use a space?
- I suggest parentheses, the way it's done in en. Delldot 23:06, 27 February 2007 (UTC)
- Parentheses won't work. English is written only in the Latin alphabet. The Latin equivalent of ᐅᐃᑭ (wiki) is uiki; this isn't a transliteration. In the Northwest Territories Inuktitut is written only in Latin letters, and in Quebec, Labrador and Nunavut it is written only in Canadian Aboriginal syllabics. Furthermore, «ᐅᐃᑭ (uiki)» makes uiki look less important. Would «ᐅᐃᑭ_uiki», «ᐅᐃᑭ uiki», «ᐅᐃᑭuiki» or «ᐅᐃᑭ-uiki» work? So English does not in fact do it that way.
Protection
Should the main page be protected? I thought about doing it but I'm not sure. What do you all think? Qrc2006 22:55, 1 March 2007 (UTC)
- Taking into account the latest incident, semi-protection might not be a bad idea. --Johannes Rohr 10:02, 8 March 2007 (UTC)
Dual language
I don't think having two transliterated versions for every piece of text in the wiki is particularly feasible in the long run; it impedes legibility and, frankly, looks kind of hideous. Could we not have an automated transliteration in the style of the Serbian Wikipedia? A syllabics<->Latin conversion script would be easy enough to write... 207.112.44.6 01:57, 8 March 2007 (UTC) (en:User:Moszczynski)
- I strongly support this idea. A solution similar to the Serbian Wikipedia, where you can switch the alphabet by clicking "latinica" or "ћирилица", would be ideal. In Serbian, you can also set your preferred language variant (Cyrillic or Latin) in your personal settings.
- This is so much better than having both scripts mixed everywhere. While for stubs with just a few words, this mix is not such a big issue, it is clearly not an acceptable solution for real articles like ᑲᓇᑕᒥ ᐃᓄᐃᑦ ᐅᖃᐅᓯᖏᑦ/Kanatami Inuit Uqausingit.
- The question is,
- whom should we contact for assistance?
- Is there a simple algorithm for auto-conversion, given that there is no one-to-one correspondence between the letters of the two scripts (unlike Serbian, where this is pretty straightforward)?
- Who might be capable of implementing a solution?
- What would be required to make it work? For instance, I suppose that strings in languages other than Inuktitut would have to be marked as non-convertible, because auto-converting English words into syllabics would result in unreadable garbage.
- What strategy would be best to prepare such a solution? I suppose that the current mix of scripts will make a transition more difficult and would have to be sorted out first. All articles and system messages would have to be fixed to contain only one variant of the language.
- If markers are needed for non-Inuktitut texts, how would the talk pages be handled? They are mostly, if not exclusively in English. Would participants be required to mark in which language they have written, in order to prevent broken auto-conversion? --Johannes Rohr 12:00, 8 March 2007 (UTC)
- To answer your questions, the algorithm is in fact rather simple; there's no real difficulty in mapping glyphs to multi-character sequences and vice versa. I could write the algorithm myself, but what I don't know is how such things get integrated into the MediaWiki software itself, nor how to deploy upgrades to that software. I'll ask on the en: village pump for some pointers.
- I understand the Serbian has a special syntax ( --{ }-- I believe ) to mark words that are not to be converted. And yes, we'll have to convert all pages to one variant, or handle source in multiple variants as I believe zh: does. I'll inquire as to how this is done as well.
- Unfortunately, I'll be away a few days so I won't be able to put much effort into this until next week. Moszczynski 16:54, 8 March 2007 (UTC)
- To me the simplest solution would be to have all articles written in syllabics and converted to Latin, because syllabics-to-Latin conversion is always exact and easy to do, whereas Latin-to-syllabics is not: a combination of letters, or even one particular letter, can correspond to different syllabic characters. Someone suggested that, since Inuktitut has a relatively small vocabulary, we could put every Inuktitut word into a program for auto-conversion. The --{ }-- marker seems OK, but it also seems annoying to type; that is 9 keystrokes. Maybe something simpler, such as _{ }_, which requires only 4 keystrokes and lets you keep the Shift key held down. I don't think legibility is hampered in the least; that is a rather negative observation, in my opinion. Maybe we can do something much simpler: the interface can be in both syllabics and Latin, but any given article can only be in one or the other. There wouldn't be a ᐊᒥᖅ/amiq page; there would be either a «ᐊᒥᖅ» page or an «amiq» page, or hopefully both, though both need not exist. Or maybe there is a ᐊᒥᖅ/amiq page in syllabics, and if you want Latin script you click on a Latin-script tab; the two articles would not have to be identical at all. One could be a page long and the other a paragraph, or one could be a mini stub and the other a short stub. Remember, we can always choose a different way to do things; we can look to other wikis for ideas, but we don't have to imitate them. Oh, and Inuktitut never looks hideous, whether it is written in syllabics, in Latin letters, or in both side by side. Qrc2006 21:40, 8 March 2007 (UTC)
- But if it is technically possible to have full auto-conversion with relatively limited effort, then why not do it? I have some difficulty understanding your reservations. BTW, the marker used in srwiki is just "-{}-", so four keystrokes. See [1] for how they do it.--Johannes Rohr 22:19, 8 March 2007 (UTC)
Ok, so here is some technical info. The auto-conversion was initially developed for the Chinese wiki, but was never fully finished. At some point I took over and implemented the stuff we needed for sr.wiki. So the system is not yet perfect, but it works reasonably well. Here is how we set things up at sr.wiki. There are 3 views of an article: default (no conversion, just shows what's in the DB), Cyrillic and Latin. Anonymous users see the default one unless they click on one of the script tabs. When a user edits the article, he/she edits the database version (i.e. no conversion takes place in edit boxes). Each article can be written in either script, but it must be in one script. I.e. if an article was started in Latin, subsequent editors need to write it in Latin; this is a community-enforced restriction. As noted above, there are tags to disable conversion of certain parts of the text (-{}-) and to disable conversion of titles (__NOTC__). Talk pages are not converted by default. Also, all links, categories and templates can be written in both scripts. This means that even when the article title is in Latin, it can be referenced in another article by its Cyrillic name.
Now the programming part. I think it's fairly easy to develop the conversion for Inuktitut. The steps would be something like:
- Find a PHP programmer, or just a person who has programmed something at some point in their life. No mastery is required.
- Get an svn version of MediaWiki (or the latest stable release) and install it on your computer. Look at languages/classes/LanguageSr*.php. You just copy and rename them to something like LanguageIu*.php. In LanguageSr.php you'll find the conversion table. It should be straightforward to modify it to your needs. You might need to edit some other files as well, like Names.php, to get your variant names right.
- Test that everything is working for you on your local copy of mediawiki
- Make a patch (if you're using svn just do "svn diff", else use the diff tool)
- Make a bug report on mediawiki bugzilla and attach your patch to it
- Wait! Getting stuff into mediawiki is never too fast, it can take from weeks to months
I hope I covered some of the topics, but now I need to go to sleep. :) If you have further questions, please leave them on the English page on sr.wiki. Rainman 04:21, 9 March 2007 (UTC)
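To illustrate the -{ }- escape markers described above, here is a minimal Python sketch, not the actual MediaWiki code (which is PHP). The three-entry table, the function names, and the choice to drop the markers from the output are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny table (just enough for the word "uiki");
# a real converter would need the full syllabary.
SYLLABIC_TO_LATIN = {"ᐅ": "u", "ᐃ": "i", "ᑭ": "ki"}

# Matches an escaped span such as -{English text}- (non-greedy, may span lines).
ESCAPE = re.compile(r"-\{(.*?)\}-", re.DOTALL)


def map_chars(text):
    """Replace every character found in the table; leave all others alone."""
    return "".join(SYLLABIC_TO_LATIN.get(ch, ch) for ch in text)


def convert(text):
    """Convert text, copying the contents of -{ }- spans through unchanged."""
    out = []
    pos = 0
    for match in ESCAPE.finditer(text):
        out.append(map_chars(text[pos:match.start()]))  # ordinary text: convert
        out.append(match.group(1))                      # escaped text: keep as typed
        pos = match.end()
    out.append(map_chars(text[pos:]))                   # tail after the last marker
    return "".join(out)


print(convert("ᐅᐃᑭ -{ᐅᐃᑭ}-"))  # first word is converted, escaped word is not
```

Anything outside the markers goes through the table; anything inside is passed through verbatim, which is how non-Inuktitut strings could be protected.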
- I am confused by the fact that a Latin article cannot be edited in Cyrillic. Why not store the source in Cyrillic, edit it, and then let anyone see it converted to Latin or Arabic if that's their preference? Wouldn't that work? Qrc2006 22:25, 9 March 2007 (UTC)
There are technical problems with this approach. First, the whole database would need to be converted to Cyrillic, which could be a problem given that there is still a lot of unescaped English text, which would turn into Cyrillic nonsense. Second, this approach would involve converting wikitext (and not HTML, as is done now). It would be better to convert wikitext, but the original Chinese implementation just doesn't do it, and for good reason: parsing. If the parsing is wrong, it will mess up the article source and possibly convert stuff that doesn't need to be converted. Currently people are thinking about GUI editing, which means that wikitext is converted to some meta-text suited to a GUI and then converted back to plain wikitext. The problem with this approach is that nobody can really duplicate the behaviour of the parser, since it's very messy, written by many people with minimal documentation, and full of hacks. Anyway, it's not entirely impossible, but it would need a dedicated programmer, of which there are none yet. --Rainman 14:13, 10 March 2007 (UTC)
- I've been looking at the code to see what the issues are, and trying (and failing) to get a copy of the wiki deployed on my local computer. The problem with having each article in only one script is that the Inuktitut-speaking community is so small that this kind of divide would be a serious obstacle to contribution. I think we can have transparent dual-script reading and editing by doing the following.
- Whenever a user saves an article, any syllabics are converted to the latin script
- Whenever a user views an article and requests syllabics, any latin characters are converted to syllabics
- Whenever a user clicks Edit on an article, any latin characters are converted to syllabics before being displayed in the edit box (if his preferences are syllabics)
- This way, all source is in the Latin script (we'd have to convert existing articles, but we'll have to do that no matter what). Latin is better for source partly because most non-Inuktitut text (ie links, image filenames, etc) is likely to be in Latin, but mostly because syllabic text would be less likely to contain invalid text, so if someone puts in a v or a misplaced h we'd rather have the error be on display than on save to db. Other than that, I'd recommend sticking to the sr.wiki conventions on everything else, ie -{}- for escapes, no converting talk pages, etc. This way, I don't think we'll have to resort to parsing wikitext, but I'm not familiar enough with the software to be certain. I'll do my best to see what problems there are with the approach above, and no doubt tell you in a few days that it's not feasible :)
- Thoughts? Moszczynski 04:50, 13 March 2007 (UTC)
- Inuktitut is almost always written in syllabics by the majority of writers; Latin script is used only in the Northwest Territories. Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 05:19, 13 March 2007 (UTC)
- Also, there is no exact way of converting Latin script to syllabics, because there are many possibilities when it comes to letter combinations. Converting syllabics to Roman letters, however, is exact, since the combination of Roman letters equivalent to each syllabic is always known. For example:
- kuu could be «ᑰ», «ᑯᐅ» or «ᒃᐅᐅ»
- but ᑰ will always be kuu, ᑯᐅ will always be kuu, and ᒃᐅᐅ will always be kuu
- Unless this can be overcome, only syllabics can be converted to Latin, not Latin to syllabics. Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 05:24, 13 March 2007 (UTC)
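The asymmetry in the kuu example can be shown with a small Python sketch. The table below is a hypothetical four-entry one covering only the characters in that example, and the function names are made up for illustration; a real table would cover the whole syllabary.

```python
# Hypothetical table: only the four syllabics from the "kuu" example.
SYLLABIC_TO_LATIN = {
    "ᑰ": "kuu",
    "ᑯ": "ku",
    "ᐅ": "u",
    "ᒃ": "k",
}


def to_latin(text):
    """Syllabics -> Latin: each character has exactly one Latin sequence."""
    return "".join(SYLLABIC_TO_LATIN.get(ch, ch) for ch in text)


def syllabic_spellings(latin):
    """Latin -> syllabics: list every syllabic string that romanizes to `latin`."""
    if not latin:
        return [""]
    results = []
    for syllabic, roman in SYLLABIC_TO_LATIN.items():
        if latin.startswith(roman):
            for rest in syllabic_spellings(latin[len(roman):]):
                results.append(syllabic + rest)
    return results


candidates = syllabic_spellings("kuu")
print(candidates)                         # several possible spellings
print({to_latin(s) for s in candidates})  # but only one romanization
```

Going from syllabics to Latin is a plain lookup, while going the other way needs some rule for picking among the candidates.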
- Words would be spelled incorrectly, although phonetically they would be pronounced right. Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 05:25, 13 March 2007 (UTC)
- Do instances actually exist where «ᑯᐅ» or «ᒃᐅᐅ» is preferred to «ᑰ»? I didn't realise that Inuktitut could have two consecutive instances of a single vowel that did not constitute a long vowel. If this is indeed the case, it must surely be rare; I'm sure we could have a special syntax to specify the exceptional cases. Moszczynski 05:33, 13 March 2007 (UTC)
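If the long-vowel glyph really is the normal spelling, one possible rule is greedy longest match: at each position take the longest Latin sequence that has a syllabic. The Python sketch below assumes exactly that preference and uses a hypothetical four-entry, lowercase-only table; exceptional spellings would still need some separate syntax.

```python
# Hypothetical table: only the sequences from the "kuu" example, lowercase only.
LATIN_TO_SYLLABIC = {"kuu": "ᑰ", "ku": "ᑯ", "u": "ᐅ", "k": "ᒃ"}
MAX_KEY = max(len(key) for key in LATIN_TO_SYLLABIC)


def to_syllabics(latin):
    """Greedy longest-match conversion; unknown characters pass through."""
    out = []
    i = 0
    while i < len(latin):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(MAX_KEY, len(latin) - i), 0, -1):
            chunk = latin[i:i + size]
            if chunk in LATIN_TO_SYLLABIC:
                out.append(LATIN_TO_SYLLABIC[chunk])
                i += size
                break
        else:
            out.append(latin[i])  # not in the table: copy unchanged
            i += 1
    return "".join(out)


print(to_syllabics("kuu"))   # the long-vowel glyph wins over the other spellings
print(to_syllabics("ku u"))  # a separator forces the two-character spelling
```

With this rule a word typed as «ᑯᐅ» and stored as kuu would come back as «ᑰ», which is the loss of spelling described above.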
I think you are underestimating the job of conversion. According to your scheme, at some point you might have something like this in your database:
{| border = 1
| sometext {{template|key = Inuktitut text}}
| ...
|}
<gallery>
Image:Wikipedesketch1.png|The Wikipede
</gallery>
... etc
Now, to be able to convert this stuff from the database to Inuktitut, you need to parse the wikitext, because you don't want to convert template keys, you don't want to convert table keywords, and you don't want to convert the image name, but you do want to convert the image caption, maybe the template parameter, possibly the template name? ... And in MediaWiki there are hooks to introduce new syntax, so every new hook will break the conversion, because hooks are introduced to render new syntax into HTML ... --Rainman 15:44, 13 March 2007 (UTC)
So, to conclude: in theory you could do it without parsing, by just showing the English keywords as badly converted Inuktitut characters and learning to ignore them, but I think that wouldn't be a very elegant solution, since it might render the wikitext unreadable. --Rainman 16:00, 13 March 2007 (UTC)
- I think what I might be misunderstanding is how that's different from what happens now...I mean, at some point, the text gets converted, and right now (as I understand it) it's on article save. What's the difference if there's a syllabic-> latin conversion on save and a latin-> syllabic conversion on render? Are there extra difficulties other than simply the time at which the conversion happens? I think I'm missing something important. Moszczynski 16:25, 13 March 2007 (UTC)
Here is your proposal:
- Whenever a user saves an article, any syllabics are converted to the latin script
This is OK. It is not done now, i.e. there is no conversion when one saves the article; it just gets saved to the DB however you type it (mixed letters or not). But it is easy to implement.
- Whenever a user views an article and requests syllabics, any latin characters are converted to syllabics
OK, too. This is how it works now. The wiki parser outputs nicely formatted HTML with tables, etc.; the converter picks up the HTML (actually HTML-like code, since some stuff is absent, like nowiki tags) and converts everything that's not an HTML tag into the selected script.
- Whenever a user clicks Edit on an article, any latin characters are converted to syllabics before being displayed in the edit box (if his preferences are syllabics)
Now here is the problem. You have a source in the edit box, which is not parsed, but just dumped into the edit box. If you are not to mess up the wiki syntax with converting _everything_ to cyrillic or syllabic script you need to parse the wiki text. --Rainman 19:09, 13 March 2007 (UTC)
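The difference between converting rendered HTML and converting raw wikitext can be sketched in a few lines of Python. This is only an illustration, not the MediaWiki converter: the table is a hypothetical three-entry one, the direction shown is syllabics to Latin for simplicity, and the tag pattern assumes no ">" inside attribute values.

```python
import re

# Hypothetical, deliberately tiny table (just enough for the word "uiki").
SYLLABIC_TO_LATIN = {"ᐅ": "u", "ᐃ": "i", "ᑭ": "ki"}

# One capturing group, so re.split keeps the tags in the result list.
TAG = re.compile(r"(<[^>]*>)")


def to_latin(text):
    """Replace every character found in the table; leave all others alone."""
    return "".join(SYLLABIC_TO_LATIN.get(ch, ch) for ch in text)


def convert_html(html):
    """Convert only the text between tags; tags and attributes stay as they are."""
    pieces = TAG.split(html)
    # Even indexes hold text between tags, odd indexes hold the tags themselves.
    return "".join(
        piece if index % 2 else to_latin(piece)
        for index, piece in enumerate(pieces)
    )


rendered = '<a href="/wiki/ᐅᐃᑭ">ᐅᐃᑭ</a>'
print(convert_html(rendered))    # link target untouched, visible text converted

# Raw wikitext has no such easy boundary: a blind pass rewrites the template name too.
print(to_latin("{{ᐅᐃᑭ|ᐅᐃᑭ}}"))
```

On HTML the tag boundaries tell the converter what to skip; in the edit box there is only wikitext, so the same blind pass also changes names that must stay as typed.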
Ahh, I understand now how everything works, thanks for the explanation. Well in that case, implementing a system like on the Serbian wikipedia would be easy enough, and mixed source, I think, would be okay. But to have transparently dual-script editing is a pretty massive challenge; we'd need hooks at almost every stage of parsing. So I guess we should move ahead and just implement the viewing...? Moszczynski 05:35, 14 March 2007 (UTC)
Has the same approach been used in the Kazakh Wikipedia? There, they have three-way conversion (Latin, Arabic, Cyrillic) and I believe that Cyrillic/Latin <-> Arabic conversion is not always one to one, so that there might be the same kind of ambiguities as between Syllabics and Latin.
P.S.: Is anyone working on this now? --Johannes Rohr 06:39, 2 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)
- Whatever the case, whoever wishes to can do it now and try to come up with something. Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 21:57, 10 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)
ᐱᒋᕚ ᒧᓗᕗᖅ/pigivaa muluvuk/have been gone a long time
- ai takugiik qaritaujaq inuquti siqumiksimajuq
- ᐊᐃ ᑕᑯᒌᒃ ᖃᕆᑕᐅᔭᖅ ᐃᓄᖁᑎ ᓯᖁᒥᒃᓯᒪᔪᖅ
- hi friends my computer is broken
Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 23:34, 6 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)
- Oh, that's bad. I hope you will soon find a way to repair it (or get a new one). Have a good time, --Thogo (Talk) 21:41, 7 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)
- ᐅᑎᖅᐹ ᐅᖃᐅᑕᐅᕗᖅ. utiqpaa uqautauvuq. I said I returned. Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 08:45, 9 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)
special characters box
The special characters box at the bottom of the screen is not working. Why? Can anyone help or fix it? Qrc2006•ᐊᓪᓚᖁᑎᒃᑲallarkutikka•ᑕᓕᐊᖅtaliaq 21:58, 10 ᐊᐃᐳᕆᓪ/aipuril 2007 (UTC)