Latest news

June 24, 2022 by Don Hosek

Since it’s been a while since I’ve posted an update, current status: I’m working on the parser, but had a bit of a digression to create some Unicode support with the interface that I need for finl. The Unicode Segmentation crate is a nice piece of code, but it assumes that it is to be the primary interface for iterating over input and I, instead, want to be able to augment the CharIndices iterator that I’m using in the parser. So I had to create my own implementation of Unicode grapheme segmentation. I had initially (incorrectly) assumed I would also need to have to implement Unicode category support as well for this but it turned out I was wrong. Regardless, I did it anyway because I find this kind of coding fun. My implementation of Unicode category identification is significantly faster than the Unicode Categories crate (by a factor of ten according to my benchmarks) which is good because I call this on all command names so it will have a noticeable improvement in speed. My segmentation code on the other hand is running much slower than the Unicode segmentation crate, but it may be an artifact of the fact of how the two are implemented (I return a new String from a call on the CharIndices iterator while they, because they’re creating an iterator over *str, are able to return *str references and avoid the copy and allocate). As a result, to make sure that my algorithm is not flawed (it does at least pass all the tests), I’m also implementing the algorithm as an iterator on a *str for benchmarking purposes. I then need to do some documentation and I will put a 1.0 crate for finl_unicode out into the world. I’m torn between hoping that I won’t have to implement more Unicode support and hoping that I will. This stuff is fun to code.¹

If anyone wants to hire me to write Unicode support code, I’m listening.

Comments |0|

Cancel

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Legend *) Required fields are marked
**) You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

Category: Uncategorized

finl is not LaTeX

Latest news

Comments |0|

Cancel

Recent Posts

Recent Comments

Archives

Categories

Meta