How is the "proper" right-to-left text written, and wrapped?

Started by Crimson Wizard, Sun 09/04/2023 01:42:22


Snarky

#20
Quote from: Crimson Wizard on Fri 14/04/2023 12:30:42
Afaik this is what Mehrdad is currently doing: he puts linebreaks himself, everywhere where necessary.

Right, that's the simple solution, but what I meant was that one could also just write a "reversed-pRTL" (pseudo-right-to-left) linebreak algorithm in script (measuring line widths from the end of the string, and reordering the lines so that the end of the string is at the top), rather than implement it in the engine.

Because to me this seems like cruft: special-case code that isn't useful for 99% of users, and adds complications to the engine while not really being the "right" solution to the problem.

(Right-alignment as an option in more GUI Controls would be useful, but shouldn't be tied to a RTL setting.)

Crimson Wizard

#21
Quote from: Snarky on Fri 14/04/2023 12:37:23
Quote from: Crimson Wizard on Fri 14/04/2023 12:30:42
Afaik this is what Mehrdad is currently doing: he puts linebreaks himself, everywhere where necessary.

Right, that's the simple solution, but what I meant was that you could also just write a "reversed-pRTL" (pseudo-right-to-left) linebreak algorithm in script (measuring line widths from the end of the string, and reordering the lines so that the end of the string is at the top), rather than implement it in the engine.

I guess you may try that, except you will then have to apply this everywhere throughout the game, for each existing string that may be wrapped:
- when assigning a string to gui labels;
- when calling Display (and variants);
- when calling DrawingSurface.DrawStringWrapped;
- when displaying speech (so you have to write a custom speech function);
- when displaying dialog options (so you have to write custom dialog option rendering).

This will also make adding e.g. Arabic / Persian translations to existing games quite difficult.

Implementing this as an option for the text wrapping algorithm would likely cover this case for the time being (again, I cannot tell when the new font renderer can be expected, or which new problems it would bring).

Quote from: Snarky on Fri 14/04/2023 12:37:23
(Right-alignment as an option in more GUI Controls would be useful, but shouldn't be tied to a RTL setting.)

Controls have an alignment setting, but that is a separate matter: they never reverse text. I don't know why; either this is a regression in the new 3.* engine, or the RTL feature was never complete.
This won't be an issue with the converter solution though, as it does not need text reversal.

Actually, if we have a font renderer that can account for direction control chars, as eri0o mentioned, maybe we won't need the RTL setting in the engine at all (other than for backwards compatibility).

Snarky

Quote from: Crimson Wizard on Fri 14/04/2023 12:43:52
I guess you may try that, except you will then have to apply this everywhere throughout the game, for each existing string that may be wrapped:

Easily done with a few helper functions: DisplayRtl, SayRtl, DrawStringWrappedRtl, Label.SetTextRtl...
Yes, you might have to implement custom dialog options rendering, but again, this seems like an edge-case of an edge-case.

Quote from: Crimson Wizard on Fri 14/04/2023 12:43:52
Actually, if we have a font renderer that can account for direction control chars, as eri0o mentioned, maybe we won't need the RTL setting in the engine at all (other than for backwards compatibility).

We might want a way to do RTL text input, but it should probably be a setting per TextBox rather than game-wide. (Though if the input box was made fully bidi-compatible, the only effect of this setting might be that the caret would start off right-aligned rather than left-aligned.) And in the meantime, the TextField module could relatively easily be modified to do so.

Crimson Wizard

#23
Quote from: Snarky on Fri 14/04/2023 13:36:10
Quote from: Crimson Wizard on Fri 14/04/2023 12:43:52
I guess you may try that, except you will then have to apply this everywhere throughout the game, for each existing string that may be wrapped:

Easily done with a few helper functions: DisplayRtl, SayRtl, DrawStringWrappedRtl, Label.SetTextRtl...
Yes, you might have to implement custom dialog options rendering, but again, this seems like an edge-case of an edge-case.

Well, then someone will have to help with these script functions, or write a module...

At least in the case of speech, one would have to comply with the undocumented width calculations of the speech overlays. Or write completely custom speech to not depend on that.

Another thing to consider is that you will have to add these to your game if you even suspect that it might get an Arabic/Persian translation, because you cannot add these helper functions through the translation file itself. If the game was already done when you realized that you want such a translation, then you'll have to edit all the text assignments and Say calls in script.

Snarky

Quote from: Crimson Wizard on Fri 14/04/2023 15:18:25
Well, then someone will have to help with these script functions, or write a module...

At least in the case of speech, one would have to comply with the undocumented width calculations of the speech overlays. Or write completely custom speech to not depend on that.

The SpeechBubble module already has a version of this calculation (for LucasArts speech).

(Edit: The SpeechBubble calculation differs a little from the engine calculation, in order to accommodate the border around a speech bubble. Example modified to match engine.)

Code: ags
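// Approximates the maximum text width the engine uses for LucasArts-style speech.
int lecSpeechWidth(Character* c)
{
  int vpWidth = Screen.Viewport.Width;
  // Character's position in screen coordinates
  Point* cp = Screen.RoomToScreenPoint(c.x, c.y);
  int cx = cp.x;
  // Default width: roughly two thirds of the viewport
  int w = vpWidth/2 + vpWidth/6;
  // Narrower if the character stands in the outer quarter on either side
  if(cx < vpWidth/4 || cx > vpWidth - vpWidth/4)
    w -= vpWidth/5;
  return w;
}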

So, assuming a String.BreakRtl(int width, FontType font) extender function, SayRtl() could be implemented like so:

Code: ags
function SayRtl(this Character*, String message)
{
  String rtlMessage = message.BreakRtl(lecSpeechWidth(this), Game.SpeechFont);
  this.Say(rtlMessage);
}

I think in each case all the helper functions would be a one- or two-liner.
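
For instance, under the same assumption of a String.BreakRtl extender, a few of the other helpers might look roughly like this. It is an untested sketch: the names follow the ones suggested earlier in the thread, and the explicit width parameter of DisplayRtl is an assumption, since the width of a Display box is not something the script can simply query.

```ags
// Assumes String.BreakRtl(int width, FontType font) is declared in a module
// header, as in the SayRtl example above (hypothetical function).

// Display() variant; the caller passes the wrap width (assumption).
function DisplayRtl(String message, int width)
{
  // "%s" keeps any '%' in the message from being read as a format code
  Display("%s", message.BreakRtl(width, Game.NormalFont));
}

// Label variant: wrap to the label's own width and font.
function SetTextRtl(this Label*, String text)
{
  this.Text = text.BreakRtl(this.Width, this.Font);
}

// DrawingSurface variant: pre-break the text, then draw it right-aligned.
function DrawStringWrappedRtl(this DrawingSurface*, int x, int y, int width, FontType font, String text)
{
  this.DrawStringWrapped(x, y, width, font, eAlignRight, text.BreakRtl(width, font));
}
```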

Quote from: Crimson Wizard on Fri 14/04/2023 15:18:25
If the game was already done when you realized that you want such translation, then you'll have to edit all the text assignments and Say calls in script.

Is this really a problem in practice, though? If this situation were to occur, there are two options:

1. Edit all the relevant calls and string assignments and rebuild. Most of that can probably be done by search-replace.
2. Do the linewrapping manually in the translation file.

Keep in mind that we're talking about support for a hacky workaround as a stopgap solution.

I don't know, man. You do what you want. It's not like I can stop you. But you do periodically bemoan how you keep getting distracted by patching up and making minor additions and tweaks to the current, out-of-date code instead of focusing on a forward-looking architecture.

Crimson Wizard

#25
Well, I'd like to see if this script solution would work for @Mehrdad. Also afaik @Wesley wanted to try Arabic translations in his game.

I wrote an experimental change for the engine already (found here), but I will postpone merging until it is clearer whether the scripting solution works conveniently.

Crimson Wizard

#26
Someone has to write a script module for handling these special cases.

I think what is required is this:
* a function that calculates width of drawn text (given string, font and width limit) and inserts linebreaks in it.
* a special handling for the case when the text has to be read Right-to-left. Probably this means that either the splitting has to be done in reverse, or the text has to be reversed char by char twice: first before the splitting and then each separate part (between the linebreaks) is reversed on its own again.
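
As a minimal, untested sketch of the first variant (splitting in reverse): the function below walks the words from the end of the string, fills each line up to the width limit, and emits the lines so that the end of the string lands on the top line. The name BreakRtl and its signature follow the extender assumed earlier in the thread. It assumes the text is already stored in reversed order, contains no manual linebreaks, and uses plain spaces between words.

```ags
// Hypothetical extender: wraps an already-reversed ("pRTL") string.
String BreakRtl(this String*, int width, FontType font)
{
  String result = "";    // finished lines, top line first
  String line = "";      // line being built; it grows to the left
  int end = this.Length; // one past the last char of the current word

  while (end > 0)
  {
    // Scan left to the nearest space (code 32) to find the word start
    int start = end;
    while (start > 0 && this.Chars[start - 1] != 32)
      start--;
    String word = this.Substring(start, end - start);
    end = start - 1;     // step over the space to the left of the word
    if (word.Length == 0)
      continue;          // repeated spaces: nothing to add

    // Try to put the word in front of the current line
    String candidate = word;
    if (line.Length > 0)
    {
      candidate = word.AppendChar(32);
      candidate = candidate.Append(line);
    }

    if (line.Length > 0 && GetTextWidth(candidate, font) > width)
    {
      // Line is full: emit it, then start a new line with this word
      if (result.Length > 0)
        result = result.AppendChar(10); // LF
      result = result.Append(line);
      line = word;
    }
    else
    {
      line = candidate;
    }
  }

  // Emit the last (bottom) line, which holds the start of the string
  if (line.Length > 0)
  {
    if (result.Length > 0)
      result = result.AppendChar(10);
    result = result.Append(line);
  }
  return result;
}
```

A single word wider than the limit is left on a line of its own rather than being split.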


For reference, a long time ago I wrote a function that splits text for the TypedText module:
https://github.com/ivan-mogilko/ags-script-modules/blob/f55bc9015f6e6443de7f4d293b5b199779b79e88/scripts/gui/TypedText/TypedText.asc#L41
but it actually had a bug, mentioned here, with a proposed fix:
https://github.com/ivan-mogilko/ags-script-modules/issues/5

I had plans to pick this function out as a separate module, but never found the time to do this.

Or it could be rewritten from scratch.

Snarky

A linebreak function would be useful to have in general. I'm interested in having a crack at it. One question: how will manual linebreaks made using "/n" appear in the reversed "RTL" string? As "/n", as "n/", or already converted to a newline code? (Actually, it might be best to optionally support all of these as well as the old ']', using some kind of configuration bit field.)

Crimson Wizard

#28
Quote from: Snarky on Fri 21/04/2023 15:05:23
One question: how will manual linebreaks made using "/n" appear in the reversed "RTL" string? As "/n", as "n/", or already converted to a newline code? (Actually, it might be best to optionally support all of these as well as the old ']', using some kind of configuration bit field.)

No, the line breaking chars do not need to be reversed, that makes no sense whatsoever. The `\n` (escaped 'n') is not treated as two characters, it's processed as a single special character called LF (ASCII code 10).
Converting '[' to ']' will only make sense if that's a displayed character, but won't if it's a special break character, in which case it must retain its code.
Both '\n' and '[' are handled by AGS during wrapped string drawing (I think it converts one to the other for consistency, but I forgot the details); if they are reversed, then nothing will work.

Snarky

Quote from: Crimson Wizard on Fri 21/04/2023 15:13:27
No, the line breaking chars do not need to be reversed, that makes no sense whatsoever. The `\n` (escaped 'n') is not treated as two characters, it's processed as a single special character called LF (ASCII code 10).
Converting '[' to ']' will only make sense if that's a displayed character, but won't if it's a special break character, in which case it must retain its code.
Both '\n' and '[' are handled by AGS during wrapped string drawing (I think it converts one to the other for consistency, but I forgot the details); if they are reversed, then nothing will work.

If you think I meant to turn '\' into '/' and '[' into ']', then I certainly agree that makes no sense whatsoever. That was merely me not remembering which symbols are actually used.

As for the rest, it depends on which character code is actually stored in the string at string manipulation time. If it is always LF, then things are simple and it is clear what to do. However, you say that the conversion happens "during wrapped string drawing," and my recollection is also that you do sometimes have access to strings with '[' (and "\n"?) not yet parsed. In that case we have to ensure that we are handling them correctly. It might be that the '[' is a linebreak and should be treated as such in our logic, or it might be that it really is a '[' character, and should simply be kept and wrapped as any other character.

And because we are dealing with reversed strings, with "\n" it becomes a question of whether it was inserted before or after the reversing. If it was before, it might conceivably appear as "n\" (though if correctly converted from a proper bidirectional text, I believe it shouldn't). Since AGS at no point accepts that as a linebreak, we might even need to do a replacement.

Crimson Wizard

#30
Quote from: Snarky on Fri 21/04/2023 17:33:28
As for the rest, it depends on which character code is actually stored in the string at string manipulation time. If it is always LF, then things are simple and it is clear what to do.

AFAIK all the true escape sequences ('\\', '\n', '\r', '\t' and so forth) are dealt with at compilation time, and all the '\n' in the string you typed in the Editor will be LF in the compiled data.

What the engine does at runtime is convert '[' into '\n'. This is where it tests for a backslash before the '['.
This is done just before the string wrapping algorithm, so that the latter can work strictly with '\n's.

So, at the time when the engine is drawing the line of text, it has to be either "\n" or "\\[". I suppose it's best to just replace everything with '\n' in script.
How to treat "\\[" and "[\\" in reversed text in script is indeed open to interpretation...
One solution is to ignore these completely, and suggest that users put '\n' in their texts where they want a manual linebreak, because '\n' always becomes a single character. Then they will be dealt with by the compiler. Although, I don't know how it will be displayed if you type real Unicode text in RTL mode.
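
A rough, untested sketch of that replacement step in script, for text that is still in its original (non-reversed) order. NormalizeLinebreaks is a hypothetical name, and leaving the escaped "\[" pair untouched is an assumption about what the engine expects at draw time. Character codes are written numerically (10 = LF, 91 = '[', 92 = backslash) to sidestep escaping questions.

```ags
// Hypothetical helper: turns unescaped '[' into LF, keeps "\[" as it is.
String NormalizeLinebreaks(String text)
{
  String result = "";
  int i = 0;
  while (i < text.Length)
  {
    int c = text.Chars[i];
    // Look one character ahead, if there is one
    int next = 0;
    if (i + 1 < text.Length)
      next = text.Chars[i + 1];

    if (c == 92 && next == 91)
    {
      // Escaped bracket "\[": copy both chars so it still reads as a literal '['
      result = result.AppendChar(92);
      result = result.AppendChar(91);
      i += 2;
    }
    else
    {
      if (c == 91)
        result = result.AppendChar(10); // bare '[' is a linebreak
      else
        result = result.AppendChar(c);
      i++;
    }
  }
  return result;
}
```

Note that a wrapping function measuring the result with GetTextWidth would still count the backslash of a kept "\[" pair, so line widths may come out slightly too large for such text.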

Snarky

I suppose I should share the module I made here.

In implementing it, I realized a few things:

While the logic to calculate the line width of LucasArts speech is relatively simple, the calculations for Sierra speech are far more complicated, depending both on Text Window settings and on the speaking character's Speech View. There are also a lot of different configuration settings and other properties that affect how text is displayed, so replicating the full logic would indeed be pretty complicated. I've taken a simpler approach of automatically calculating it for LucasArts speech, but just allowing users to set it as a static value for Sierra speech.

There are a few subtleties with "[" and "\n". The first is that if the player enters text via a TextBox, "\n" is not converted to a LF character, but remains a backslash-n sequence in the String (like when you escape it in Strings in the editor), and is displayed as such. (This is probably correct behavior for most normal purposes.) However, AGS still interprets a "[" entered in a TextBox as a linebreak for most String display purposes, so there is arguably some inconsistency there.

The second is that not all AGS controls allow linebreaks in Strings, and those that don't (Buttons, TextBoxes and ListBoxes, IIRC) treat "\n" and "[" differently. These controls render "[" normally as an open-square-bracket, while a "\n" (i.e., LF character) is ignored or turned into a space (I forget which).

Finally, the linewrapping code should check for the escape sequence "\[" that AGS uses to actually display an open-square-bracket, so that it doesn't incorrectly interpret this as a linebreak. The current version of the module does not correctly deal with this in RTL mode.
