Improving Cantabile's Text Rendering


#1

Just wanted to post a little update about what I’ve been up to with Cantabile lately.

Firstly, I’ve been on a break for a bit because I felt at risk of burning out… but I’m ready to get back into it and was looking for something fun to start on when somebody sent me an email about text rendering which reminded me it’s something I really need to fix.

When I switched Cantabile’s rendering to Skia for the 35xx series builds I didn’t realize at the time that its text rendering was so poor. Not the rendering quality of the text itself, but its inability to handle a bunch of situations that the old version got for free from Windows’ text rendering. So for the last couple of weeks I’ve been doing a deep dive on Unicode, text rendering and layout. Wow… it’s complicated.

So what’s currently broken with text rendering?

  • For Latin based languages not too much - basically it’s just that some Unicode characters don’t render properly (eg: emojis tend to come out as boxes).

  • For other languages things are much worse. Asian languages tend to not render at all. Right-to-left languages (eg: Arabic) and other complex scripts (eg: Hindi) render characters but they’re incorrectly placed and ordered.

Also, I know there are a lot of people waiting on rich text support in show notes so I’ve decided to bite off more than I can chew and fix everything text rendering related at once:

  1. Font Fallback - the ability to automatically switch to a different font when the selected font doesn’t have a particular character (eg: automatically use an emoji typeface for emojis or a Chinese typeface for Chinese characters etc…)
  2. Text Shaping - required to render complex scripts correctly. eg: making sure accents, acutes etc… get rendered in the correct place. This is especially important for Arabic and Hindi.
  3. Correct Line Wrapping - currently Cantabile basically just wraps where there’s a space. The Unicode Consortium has a 48 page spec on how it’s supposed to work. (yes, 48 pages - fun, not)
  4. Bidirectional Text Support - mixed left-to-right and right-to-left text.
  5. Rich Text - mixed font, font size, colors, bold, italic etc…
  6. ChordPro rendering - Maybe - this isn’t really part of this but related so I might tackle it at the same time.

This is a pretty big job but I think I’ve got a good handle on it now and have started on the implementation.

Besides users working in other languages this will have the biggest impact in Show Notes where it will form the basis for either a) ability to include rich text in the existing show notes or perhaps b) a complete re-write of show notes addressing all (or at least hopefully most) of the issues that have been raised.

OK enough jibber-jabber, back to coffee and coding.

Brad


#2

Just make sure you take care of yourself, Brad. This is partly out of selfishness, because I know that if you take good care of yourself, we Cantabile users are being taken good care of as well.


#3

Well said. :slight_smile:


#4

I’m not a music app coder but I’ve done enough coding in the past to comment that this project looks like more fun than sorting out VST3 or the Mac stuff.


#5

Hey All,

Thanks for the support guys.

A quick update on this… it’s going quite well so far:

Unicode Trie Data Structure

Because much of the work in laying out text requires looking up information about each character, and since there are over 1,000,000 possible Unicode characters, there needs to be a fast but concise way of doing this. The recommendation for this is a “Unicode Trie” and I’ve ported an existing implementation and it’s passing all unit tests.
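To give a feel for why this kind of structure is both fast and compact, here is a minimal Python sketch of a two-stage lookup table, which is the basic idea behind a Unicode trie. It is not the ported implementation mentioned above. The names (`build_two_stage_table`, `lookup`) and the block size are assumptions made for illustration, and Python's standard `unicodedata.bidirectional` stands in for the per-character data.

```python
import unicodedata

BLOCK_SHIFT = 7                    # 128 code points per block (arbitrary choice)
BLOCK_SIZE = 1 << BLOCK_SHIFT
BLOCK_MASK = BLOCK_SIZE - 1
MAX_CODE_POINT = 0x10FFFF


def build_two_stage_table(prop):
    """Build (index, blocks), storing each distinct block of values only once."""
    index = []    # one entry per block of code points -> position in `blocks`
    blocks = []   # the distinct blocks of property values
    seen = {}     # block contents -> position in `blocks`
    for start in range(0, MAX_CODE_POINT + 1, BLOCK_SIZE):
        block = tuple(prop(chr(cp)) for cp in range(start, start + BLOCK_SIZE))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(table, ch):
    """Two array reads per character, no searching."""
    index, blocks = table
    cp = ord(ch)
    return blocks[index[cp >> BLOCK_SHIFT]][cp & BLOCK_MASK]


# Example: a table of bidi classes for every code point.
bidi_table = build_two_stage_table(unicodedata.bidirectional)
```

Most of the code space is made of long stretches with the same value, so many blocks are identical and get shared. That sharing is where the space saving comes from, while a lookup stays at a couple of array reads.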

Line Break Algorithm

The line break algorithm looks at a string of characters and works out the valid places for it to be broken into lines. Again I’ve ported an existing implementation and it’s passing all unit tests.
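As a deliberately naive illustration of what "valid places to break" means, the Python sketch below only knows about spaces and hyphens. The real Unicode line breaking algorithm classifies every character and applies a long list of pair rules, so treat this as a toy under that assumption and not as the ported implementation. The function name is made up for the example.

```python
def break_opportunities(text):
    """Return the indices at which a new line may start (naive rules only)."""
    breaks = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev == " " and cur != " ":
            # Break after a run of spaces, so the spaces stay on the old line.
            breaks.append(i)
        elif prev == "-" and cur.isalnum():
            # Break after a hyphen that sits inside a word.
            breaks.append(i)
    return breaks


print(break_opportunities("well-known text here"))
```

Even this toy version shows the shape of the output: a list of positions that a layout engine can later choose from when a line gets too wide.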

Bidi Algorithm

The Bi-directional text algorithm (aka Bidi) looks at a string of Unicode characters and works out which ones should be treated as left-to-right and which are right-to-left. This is a fairly complicated process but I’ve ported the Unicode org’s reference implementation and it’s passing all unit tests.
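The very first step of that process can be sketched in a few lines of Python using the standard `unicodedata` module: classify each character as strongly left-to-right, strongly right-to-left, or neither, and take the paragraph direction from the first strong character. This is a simplification (it ignores directional isolates and everything that follows in the full algorithm), and the helper names are assumptions for illustration only.

```python
import unicodedata


def strong_direction(ch):
    """Return 'ltr', 'rtl', or None for characters with no strong direction."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in ("R", "AL"):   # Hebrew-style and Arabic-style letters
        return "rtl"
    return None                      # digits, spaces, punctuation, etc.


def paragraph_direction(text, default="ltr"):
    """Pick the paragraph direction from the first strong character."""
    for ch in text:
        direction = strong_direction(ch)
        if direction is not None:
            return direction
    return default


print(paragraph_direction("Hello مرحبا"))
print(paragraph_direction("مرحبا Hello"))
```

Characters that return `None` here (spaces, digits, punctuation) are the ones that make the full algorithm complicated, because their direction has to be resolved from their neighbours.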

Line Break and Bidi Unicode Trie Data Generator

The above two algorithms require different sets of Unicode character classification data. I’ve written a script that downloads the character classifications from the Unicode database and builds a trie for each which is embedded as a resource in GuiKit. Total size of both Unicode tries combined is about 10K. Pretty happy with that - though I haven’t bench tested it for performance yet.
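The Unicode database publishes these classifications as plain text files of `range;value # comment` lines. As a rough idea of the parsing half of such a generator, here is a Python sketch that reads a few embedded sample lines written in that style. It does no downloading and builds no trie, and both `SAMPLE` and `parse_ranges` are names invented for this example, not part of the actual script.

```python
# A few sample lines in the style of the Unicode database's LineBreak.txt.
SAMPLE = """\
# Comment lines and blank lines are ignored.

0020;SP # Zs       SPACE
002D;HY # Pd       HYPHEN-MINUS
0041..005A;AL # Lu  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
"""


def parse_ranges(text):
    """Return a list of (first_code_point, last_code_point, value) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop trailing comments
        if not line:
            continue
        code_points, value = (part.strip() for part in line.split(";", 1))
        first, _, last = code_points.partition("..")
        # A single code point has no "..", so it is its own range end.
        ranges.append((int(first, 16), int(last or first, 16), value))
    return ranges


for first, last, value in parse_ranges(SAMPLE):
    print(f"U+{first:04X}..U+{last:04X} -> {value}")
```

The resulting ranges are what would then be fed into whatever builds the compact lookup structure.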

Font Fallback Experiments

Font Fallback is the process of switching to a different font for characters that the selected font doesn’t support. For example if you’re trying to draw emoji characters with the Arial font, it won’t work because Arial doesn’t have those characters.

I’ve figured out how to select a similar font with the required characters and implemented an algorithm that breaks character strings into runs of original font vs fallback font and some basic rendering experiments are working:

image
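Splitting a string into runs of original font vs fallback font can be sketched roughly like this in Python. Font coverage is faked with simple code point range checks, and the font names, `pick_font` and `split_into_font_runs` are all hypothetical. A real implementation would ask the font system which glyphs exist, and would also keep combining marks and variation selectors together with their base character.

```python
from itertools import groupby


def pick_font(ch, fonts):
    """Return the name of the first font that claims to support ch."""
    for name, supports in fonts:
        if supports(ch):
            return name
    # Nothing covers it: stay with the primary font (it will draw a box).
    return fonts[0][0]


def split_into_font_runs(text, fonts):
    """Group consecutive characters that resolve to the same font."""
    return [
        (name, "".join(chars))
        for name, chars in groupby(text, key=lambda ch: pick_font(ch, fonts))
    ]


# Hypothetical fonts: (name, coverage test). The first entry is the selected font.
fonts = [
    ("Primary", lambda ch: ord(ch) < 0x250),
    ("EmojiFallback", lambda ch: 0x1F300 <= ord(ch) <= 0x1FAFF),
]

print(split_into_font_runs("Hi 😀😀 ok", fonts))
```

Each run can then be shaped and drawn with its own font, which is why the run splitting has to happen before anything is measured.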

Font Shaping Experiments

Font shaping is the process of correctly ordering and positioning character glyphs so they make sense. For Latin languages it’s basically one character after the other. For complex scripts it’s, well… complex.

Currently GuiKit (and hence Cantabile) will render Arabic text like so: (it’s wrong)

image

HarfBuzzSharp is an open source library that handles font shaping, and after a bit of experimentation I ended up with this correct rendering of the same text:

image

What’s Next?

So that’s all the bits and pieces figured out. Now to try and bring it all together and do everything at once.

The first part of that will be to build a document object model to specify the text to render and its attributes (which bits are bold, green, etc…). That’s tomorrow’s job…


#6

Once again, impressive Brad ! <3

Be careful indeed on that burnout, I know what I’m saying.
We’re all looking forward to improvements, (me not the least, sorry for the spam hahah :wink: )
but nothing is worth your mental and physical sanity.

Maybe you should consider working together with some people.
I think a lot of C3 users would and could help with some tasks. And I see some are very skilled here. It would benefit the success and take some pressure from your shoulders.
Maybe even simple tasks can be co-moderated, like answering here on the forum etc…

Looks already very promising ! :slight_smile:
For me the ChordPro and/or markup formatting would be very handy.


#7

:crazy_face::dizzy_face:


#8

Another little progress update. Today I got a basic DOM in place and single line layout working, including font styles, font fallback, mixed LTR/RTL text and the text is aligning on the baseline correctly. It’s not perfect (I think the exclamation point near the Arabic text should be on the left), but not a bad start…

Next is line wrapping which is going to take a lot more work (and probably too much coffee).

Also, I remembered Cantabile needs the ability to truncate text that doesn’t fit and put in ellipsis (…) indicators. I’m not sure how easily that’s going to fit into all this.
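One simple way to think about truncation is to keep shortening the text until the text plus an ellipsis fits. The Python sketch below does exactly that, with `measure` as a stand-in for whatever measures rendered width (the usage below just counts characters). The function name and parameters are assumptions for illustration. It also cuts at arbitrary character positions, which would not be safe for shaped or right-to-left text.

```python
def truncate_with_ellipsis(text, max_width, measure, ellipsis="…"):
    """Return text unchanged if it fits, else the longest prefix plus ellipsis."""
    if measure(text) <= max_width:
        return text
    # Try progressively shorter prefixes, longest first.
    for end in range(len(text) - 1, -1, -1):
        candidate = text[:end].rstrip() + ellipsis
        if measure(candidate) <= max_width:
            return candidate
    return ""   # not even the ellipsis fits


# Usage with a fake measure where every character is one unit wide.
print(truncate_with_ellipsis("Hello world", 8, len))
```

With real fonts the width has to come from the laid out glyphs, which is why fitting this into the rest of the layout code is not obvious.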


#9

I agree, impressive big Brad!


#10

Well I wasn’t wrong: line wrapping is complicated but after three or four failed and discarded attempts (including one where I had everything completely inside out) I think I’ve got the basics working. It looks simple on the surface, but there’s a lot going on here:

That’s a left-to-right paragraph with embedded right-to-left text. Next up is the opposite: right-to-left paragraph with embedded left-to-right text.


Update: Damn it! Thought I had it right, but nope… RTL text isn’t wrapping correctly. Back to the drawing board. :frowning:


#11

Complex stuff indeed. The long-term benefits will make it all worth it.


#12

I think this might be the 5th rewrite of the line wrapping, and this time it might, maybe, possibly be actually right. Maybe.

The difference is subtle, but watch the Arabic words closely and you’ll see they need to be re-ordered when they’re split across lines. It also complicates end of line whitespace, which needs to be handled differently when the text at the end of the line is going right to left - to make sure it doesn’t end up either in the middle of the current line or at the start of the next line.
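The reason the words move around is that the visual reordering happens per line, after the text has been broken. Here is a small Python sketch of the reversal step from the Unicode bidi algorithm (rule L2), applied to one line at a time. The embedding levels are supplied by hand, upper case Latin letters stand in for right-to-left text, and trailing whitespace handling is left out entirely. The function name is an assumption for illustration and this is not Cantabile's code.

```python
def reorder_line(chars, levels):
    """Reverse runs from the highest level down to the lowest odd level."""
    chars = list(chars)
    odd_levels = [level for level in levels if level % 2 == 1]
    if not odd_levels:
        return "".join(chars)          # pure left-to-right line, nothing to do
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                # Find the end of this run of characters at `level` or higher.
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


# Everything on one line: the second RTL word ends up to the left of the first.
print(reorder_line("abc DEF GHI", [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]))

# Broken after DEF: each line is reordered on its own.
print(reorder_line("abc DEF", [0, 0, 0, 0, 1, 1, 1]))
print(reorder_line("GHI", [1, 1, 1]))
```

On a single line the word that comes first in the text (DEF) is drawn furthest right, but once the line is split it has to stay on the first line, so the words visibly swap places as the wrap width changes.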

Anyway, the good news is I’m back to where I was this morning and I think this new approach might simplify the right-to-left paragraph layout.


#13

Video doesn’t play? At least not for me?


#14

Try now…


#15

Yup, works. Very impressive!


#16

That’s the sort of thing I’d spend six weeks fussing with, only to find that there was a fully tested library for that very thing already up on Rosetta Code . . .

Looks good!


#17

I did look and there is one under development for Avalonia and I’ve been chatting with the developer who’s working on it.

In the meantime, I think all the basics are now working:


#18

Great progress @brad !! I have been waiting to convert my notes for a while, really looking forward to this down the way…ran the latest build at 2 back to back nights of 4 hr 50 + song gigs. Solid as could be :wink:


#19

I can also affirm @dave_dore’s findings. I am on my 4th gig tonight with the latest build. Very solid, very pleased.


#20

Superscript and Subscript working…