Mind if I ask what format your test suite is in and how you run it to get these results? Sounds like a fun project, and I would be interested in creating my own.
I can answer your previous question regarding development: maybe.
I have done some work on Crystal since 8 and recently pulled & merged upstream into Crystal locally.
Unfortunately, there has been quite a significant regression in performance. I am currently trying to track down the source(s) of the issue, but if I cannot, development may come to an end. At the moment, I do not know whether it is something I have done (most likely) or whether the searches have diverged so much that many of the tuned values in Stockfish are simply much worse for Crystal. Hopefully it is the former and I can track down the issues without much trouble. If it is the latter, development may have to stop, as I don't really have the time or resources to produce a fully "tuned" version of Crystal.