Conversation

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

What if #iocaine supported generating images too?
(inspired by @pengfold's FakeJPEG tool)

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

There are a number of rust crates that seem to make it easy to create valid JPEGs (or PNGs). The question is, what should they contain, and is the generation fast enough?

Or, perhaps another approach: what if SVG, but partially built by a markov generator? Can I make that valid? Is it something the scrapers would even care about? Or should I stick to playing with jpeg & png?

@algernon for sure the alt text provided and the garbage text around them are the most important pieces.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@alex You might be onto something.

Oh dear, this is going to be glorious.

@algernon As for generating SVGs, a fuzzer may be useful: https://komar.in/en/code/xmlfuzzer (haven't used this one, but the description fits)

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

The image crate looks like a simple way to create PNGs and JPEGs.

Question remains: what shall be the content? I suppose, the easiest is to make it user-controlled, somewhat. Provide a few template helpers that can insert various types of images at certain points.

Like, there'd be purely random, slightly randomized mandelbrot or julia fractals, and so on.

I have a ton of other ideas, but... this'll do for starters. Still need to do some benchmarking, to make sure this is even viable.
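
For a sense of what the "slightly randomized mandelbrot" option might involve, here is a minimal escape-time sketch using only the standard library. The function names, the grayscale output, and the idea of shifting the view by a per-request offset are assumptions made for illustration; this is not iocaine code, and encoding the buffer into a PNG or JPEG is left out.

```rust
/// Number of iterations before c = (cx, cy) escapes, capped at `max_iter`.
fn escape_time(cx: f64, cy: f64, max_iter: u32) -> u32 {
    let (mut x, mut y) = (0.0_f64, 0.0_f64);
    let mut i = 0;
    while i < max_iter && x * x + y * y <= 4.0 {
        let xt = x * x - y * y + cx;
        y = 2.0 * x * y + cy;
        x = xt;
        i += 1;
    }
    i
}

/// Render a width x height grayscale buffer, one byte per pixel.
/// `offset` shifts the view; a per-request value would "slightly randomize" it.
/// `max_iter` must be greater than zero.
fn mandelbrot_gray(width: usize, height: usize, offset: (f64, f64), max_iter: u32) -> Vec<u8> {
    let mut pixels = Vec::with_capacity(width * height);
    for py in 0..height {
        for px in 0..width {
            // Map the pixel into [-2.5, 1.0] x [-1.25, 1.25], then shift it.
            let cx = -2.5 + 3.5 * (px as f64) / (width as f64) + offset.0;
            let cy = -1.25 + 2.5 * (py as f64) / (height as f64) + offset.1;
            let it = escape_time(cx, cy, max_iter);
            pixels.push((255 * it / max_iter) as u8);
        }
    }
    pixels
}
```

Calling `mandelbrot_gray(64, 64, (0.01, -0.02), 50)` yields 4096 bytes that an encoder could then turn into an image; the per-pixel loop is the part that would need benchmarking.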

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Ideally, the images would be trainable too... but that's beyond my expertise. It would also require a lot of images to train on, and that's much more expensive than training on text.

So sticking with untrained, but procedurally generated or random images is the way to go for now, assuming the performance hit is acceptable.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@buherator Oooh, that looks useful. Thanks!

@algernon I think the key is keeping the resource use down, I'm most interested in iocaine to save resources, and if generating images uses too much resources, we're better off sticking with text.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@skyfaller Yep, agreed. If I add image generation, it will be entirely optional, and off by default.

@algernon @skyfaller This is why I went with "not quite jpeg" generation.

On my laptop, with FakeJPEG, I can generate around 8,500 1280x1024 "fake" jpegs per second. That's in pure Python.

Using the PIL library (where the compressor is compiled C), I can only generate about 400 per second.

Creating a JPEG is a fairly CPU-heavy operation.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@pengfold @skyfaller Ooof.

Just did a quick test with the image crate, to create 1280x1024 jpeg of pure randomness, and it was going at 10 / sec. Generating a PNG is much faster (~100 / sec), and I suppose I could make it faster if I disabled compression.

Though, the bottleneck in this case is not just the png/jpeg compression, but the generation is costly too. Would need considerably smaller images, or faster (less naive) image generation in the first place for this to be viable.
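
To put a rough number on why smaller images would help, the arithmetic below assumes 8-bit RGB (3 bytes per pixel); the thread does not state the pixel format, so treat that as an assumption. The function name is made up for this sketch.

```rust
/// Bytes in an uncompressed image, assuming 8-bit RGB (3 bytes per pixel).
fn raw_rgb_bytes(width: u64, height: u64) -> u64 {
    width * height * 3
}
```

Under that assumption a 1280x1024 image needs 3,932,160 random bytes before any compression runs, while a 256x256 image needs 196,608, which is 20 times less.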

I guess I'm not generating images just yet!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Did some benchmarking, and nope, this is not going to happen anytime soon. Generating a valid JPEG out of pure random data is slow. Generating a PNG is considerably faster, but still orders of magnitude slower than generating the text.

This doesn't scale well enough, and requires too much computing to be viable.

I still see potential in it, but will need to be smarter about how it is done. Generating an image - or even multiple images - for every request is likely not sustainable. But if we generated it for some pages only, in smaller sizes, that might work.

Still would need a way to generate them fast, and for the output to be plausible.

This is not something I have experience with, nor something I can easily borrow from someplace else. So I'm going to postpone this idea for now.

@algernon feel free to port FakeJPEG over to your favourite language. It's not a particularly complex bit of code and I'm right here if you have questions.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@pengfold <3

I might end up doing that!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@chfkch Or URLs! QR codes should be fast to generate, and small enough to include as data: URLs too. That's a neat idea!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Well, that didn't take too long, and @chfkch came up with a splendid idea: What if QR codes?

They're images, they're small, and they can be generated fast. With the qrcode, image and base64 crates, I can render 5k codes / sec into a base64-encoded data: URI on a single core.

That sounds like an acceptable speed, and I can provide a qr <STRING> template helper for people who want to opt-in.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

The qrcode crate can render into SVG directly, which would likely be faster - I haven't checked yet - but my suspicion is that the crawlers would be happier to ingest a PNG.

I'll have a look at the SVG parts too, and might offer both: qr <format> <STRING> or something...

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

base64'd SVG is at ~45000 images / sec.

@algernon what about blurhash? Just a 20-ish character string to generate there.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@mike 20-ish to generate, but for crawlers to ingest it as an image, I'd still have to encode it into some kind of image (can't rely on JS to do it for me on the client side), and then it is suddenly considerably larger, isn't it?

@algernon I really don't know, it was just a passing thought when I saw the thread. I'm not sure how efficient the image generation side is, just that it was supposed to be simple and fast so thought it might be a fit.

Haven't thought it through more than that! 😁

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@mike Oh, it was a very intriguing idea! It might still end up being useful in the long run.

Like... if I taught iocaine to train on not only text, but images too, then I could generate blurhashes of those, and then generate a random blurhash from the learned ones, turn that into an image, and use that.

It would be slower than qr code into svg (or even png), but it would generate a different kind of image with more colors, and perhaps more plausible ones, too.

I'll definitely keep blurhash in mind, for the next time I'm playing with new garbage generation methods :)

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Hmm... what should the template helper look like... I kinda want to be able to tell it the size, too, but make most things optional.

{{ qr "<STRING>" [width height] [format]}}
{{ qr "<STRING>" [format] [width height]}}

The string is required, but if there are more params: if there's only one, treat it as the format, with default width & height. If there's at least two more, then the next two are width & height, and format is an optional third.

I guess that can work. Let's see if it does in practice!
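
The parameter rule described above (one extra parameter is the format; two or more are width and height, with the format as an optional third) could be sketched like this. The `QrArgs` struct, the `parse_qr_args` function, and both default values are hypothetical; the thread does not say what the real defaults are, and this is not how the handlebars helper is actually wired up.

```rust
#[derive(Debug, PartialEq)]
struct QrArgs {
    text: String,
    width: u32,
    height: u32,
    format: String,
}

// Hypothetical defaults, chosen only for the sketch.
const DEFAULT_SIZE: u32 = 128;
const DEFAULT_FORMAT: &str = "png";

/// Parse `qr "<STRING>" [width height] [format]` style parameters.
fn parse_qr_args(params: &[&str]) -> Result<QrArgs, String> {
    let (text, rest) = params
        .split_first()
        .ok_or_else(|| "qr: missing required string".to_string())?;
    let mut args = QrArgs {
        text: text.to_string(),
        width: DEFAULT_SIZE,
        height: DEFAULT_SIZE,
        format: DEFAULT_FORMAT.to_string(),
    };
    match rest {
        // Only the string: keep every default.
        [] => {}
        // Exactly one extra parameter: treat it as the format.
        [fmt] => args.format = fmt.to_string(),
        // Two or more: width and height first, format as an optional third.
        [width, height, tail @ ..] => {
            args.width = width
                .parse()
                .map_err(|_| format!("qr: bad width {:?}", width))?;
            args.height = height
                .parse()
                .map_err(|_| format!("qr: bad height {:?}", height))?;
            match tail {
                [] => {}
                [fmt] => args.format = fmt.to_string(),
                _ => return Err("qr: too many parameters".to_string()),
            }
        }
    }
    Ok(args)
}
```

With this rule, `parse_qr_args(&["hello", "svg"])` keeps the default size and switches the format, while `parse_qr_args(&["hello", "64", "32"])` keeps the default format.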

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

chuckle

@algernon do the images have to be unique, or could it just download a bunch of CC0 images, maybe apply some ai poisoning thing (if that exists), and serve a random one per url with maybe markov alt text?

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@Ember The images must be generated by iocaine, or be completely external.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Yup, works nicely in practice, too.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

And with all the new dependencies, the static binary is only ~100k bigger. And it can generate fancy QR codes.

I just need to come up with a sensible template where the QR code fits in.

And then build a template garden, because I've seen some fancy ones!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

I should add some benchmarks to the repo.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

This looks almost good now!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

I swear the QR code's text is entirely accidental, I did not tell any crawler to fuck themselves with a pencil.

(The QR code decodes to "! The pencil felt thick and hoarse.")

@algernon Damn you make it harder to resist installing this thing every day!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@UnePorte It's pushed to the main branch already! No docs yet, though.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

The downside of the new template is that it's ~8.5k, up from the ~2.2k with the default template iocaine ships with.

Let's see if I can make it smaller, without sacrificing much...

@algernon I might give it a try this week, it makes me want to build maybe an e-commerce template, with products and images, something like that

Or add an image section to the search engine one!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@UnePorte It can only generate QR codes, though, not "real" images.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Down to 5.2k:

  • Removed OpenGraph & Twitter cards
  • Now using minify_html to further minify the output (trading some CPU time to gain size reduction)
  • Manually condensed the CSS (because minify_html leaves that alone)
  • Adjusted the config to generate two paragraphs less, since the template hardcodes two extra ones.

This looks acceptable, because my currently running instance averages around 4.5-5k pages, so 5.2k is a marginal increase.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

OTOH, on my live deployment, there's a bit of javascript to read the page contents, that JS is not part of my current test template, and it adds about 1k...

I might remove that, because while it is funny, the bots don't click it. And then we're at an OK size.

@algernon yeah but since it's SVG, it can be styled via CSS, so there's probably options here to alter colors, shapes, etc.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@UnePorte Possibly... though the SVG is currently generated as a base64 encoded image, too: <img src="data:image/svg+xml;base64,<blah>"> - not sure how styleable that is.
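
For reference, the data: URI form mentioned here can be built with nothing but the standard library. The sketch below hand-rolls standard base64 (RFC 4648, with padding) so it stays self-contained; the thread itself uses the base64 crate, and the function names `base64_encode`, `svg_data_uri` and `img_tag` are invented for this example.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 (RFC 4648) with '=' padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Missing bytes in the last chunk are treated as zero, then padded over.
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (chunk[0] as u32) << 16 | (b1 as u32) << 8 | (b2 as u32);
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        out.push(if chunk.len() > 1 { ALPHABET[((n >> 6) & 63) as usize] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[(n & 63) as usize] as char } else { '=' });
    }
    out
}

/// Wrap an SVG document in a base64 data: URI.
fn svg_data_uri(svg: &str) -> String {
    format!("data:image/svg+xml;base64,{}", base64_encode(svg.as_bytes()))
}

/// Build the <img> tag that embeds the SVG.
fn img_tag(svg: &str) -> String {
    format!("<img src=\"{}\">", svg_data_uri(svg))
}
```

On the styling question: an SVG loaded through `<img>` is rendered as an isolated image, so the page's CSS does not reach inside it. Inlining the `<svg>` element in the HTML would be needed for that.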

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Looks like minify saves me around 500 bytes on this template, that's pretty big.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@orva Sadly, smaller QR code does not necessarily translate to smaller PNG :(

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Oh, and minify_html can minify CSS! And JS too! I just have to enable the options. Neat. I can keep my templates readable then.

Let's see what happens if I add the TTS JS stuffs...

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Without JS minification, that's ~6.4k page size. If I enable JS minification, minify-js blows up:

thread 'tokio-runtime-worker' panicked at [...]/minify-js-0.6.0/src/minify/pass1.rs:288:81:
called `Option::unwrap()` on a `None` value

I guess I'm not minimizing JS for now!

@algernon Oh yeah, PNGs. I was still somehow thinking about SVGs, which would probably be smaller as there would be fewer paths.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@orva Yeah, SVGs would likely be smaller, indeed. I'll check that too, eventually. Though, right now, SVGs are bigger than PNGs (no compression).

It's a delicate balance =)

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Oh. It's an upstream issue. I can work around that, I guess... but that means I can't enable JS minimization by default, have to make it opt-in.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

With the workaround applied to my JS, and JS minimization enabled, it's 6.1k. Still too big, so the TTS parts are gonna go.

I could save a bunch if I didn't inline the CSS & JS, and would host them separately. But that would be too much work.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

HAH!

With some further tweaking, down to 5.8k with the TTS JS! Looks like it will be able to stay.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Timer precision: 40 ns
generate                           fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ builtin_template_with_defaults  72.11 µs      │ 181.3 µs      │ 74.58 µs      │ 75.9 µs       │ 65545   │ 65545
├─ builtin_template_with_minify    71.17 µs      │ 262.3 µs      │ 73.53 µs      │ 75.16 µs      │ 66305   │ 66305
├─ qr_png                          617.7 µs      │ 1.371 ms      │ 644 µs        │ 651 µs        │ 7676    │ 7676
╰─ qr_svg_raw                      623.7 µs      │ 1.36 ms       │ 648.1 µs      │ 653 µs        │ 7653    │ 7653

Ooof. That's a big drop in performance. Let's see if I can tweak it...
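
To make the size of that drop concrete, the median timings can be converted into pages per second. This is a back-of-the-envelope sketch; it assumes each benchmark iteration renders one page on a single thread, which the output above does not state outright. The function names are made up.

```rust
/// Pages per second, given the time to render one page in microseconds.
fn pages_per_second(micros_per_page: f64) -> f64 {
    1_000_000.0 / micros_per_page
}

/// How many times slower the candidate is than the baseline.
fn slowdown(baseline_us: f64, candidate_us: f64) -> f64 {
    candidate_us / baseline_us
}
```

Using the medians from the table, `pages_per_second(74.58)` is about 13,400 for the built-in template, `pages_per_second(644.0)` is about 1,550 with a PNG QR code, and `slowdown(74.58, 644.0)` comes out at roughly 8.6x.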

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Hrm. Not likely I can do much here. The PNG encoder is already at fast setting, and I can't go "just disable compression kthxbye" on it.

The next best thing is to not generate a QR code for every page. That's a bitch to benchmark, though.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

I should be able to speed up the svg:raw case, though, because that's doing a bunch of unnecessary back-and-forth conversions. At least I think so.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

aaand nope, the back & forth conversion is inconsequential, and/or rust is smart enough to figure out it's not needed in the first place.

I guess I could compare the generated assembly, but cba. Benchmark says that my optimization attempt did jack shit.

The surprising thing is that when I did some naive benchmarking earlier, using image directly, the svg generation was 10x faster than png. Now it's in the same ballpark.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

I should profile it, maybe. Question is: do I care that much?

At the moment - no. But I'll need to figure out a way to only generate QR codes on some pages, a method that's reasonably efficient, and stable (as in: every request for the same page should end up in the same situation: either always with an image, or always without).
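
One way such a stable, cheap decision could work is to hash the request path and compare it against a threshold. The sketch below uses 64-bit FNV-1a because it is tiny and gives the same result on every run; the function names and the percentage knob are assumptions for illustration, not something iocaine is known to do.

```rust
/// 64-bit FNV-1a hash: simple, fast, and stable across runs.
fn fnv1a(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Decide whether a page gets a QR code. Roughly `percent` out of every
/// 100 distinct paths will, and a given path always gets the same answer.
fn page_gets_qr(path: &str, percent: u64) -> bool {
    fnv1a(path.as_bytes()) % 100 < percent
}
```

Because the answer depends only on the path, repeated requests for the same page always agree, and the cost is one pass over the path bytes.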

Starting to think handlebars might not have been the best choice for templating.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

The problem with handlebars is that the helper functions are heavy and awkward, and probably slow as heck.

Maybe I'll just go 2.0, and replace it with something like minijinja. It has filters, a better custom filter & function story, too.

There's also tera, which I have experience with (as a user, because Zola uses it), but I'm not a big fan of its syntax. If minijinja ends up being slower than handlebars, or gets disqualified for some other reason, tera is an option.

Then there's askama, which I have used before, and it was okay, too. I think I'd prefer it over tera, too.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

One of the hard things with templates is that my helper functions need the random number generator, which is initialized for each request, for consistency. The generator is used by the functions, but it should not be exposed to the templates themselves.

With handlebars, I re-create the whole handlebars instance for every request. That's very wasteful.

If I switch away, I'd like to avoid that: create the template once, including helpers, and control them from that point on with state & context or something along those lines.
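
A rough sketch of the "create once, control with state" idea, independent of any template engine: the helper table is built a single time, and each request only creates a small RNG seeded from the path, which is handed to the helpers as an argument. Everything here is hypothetical, including `Rng`, `Renderer` and the helper names; SplitMix64 and FNV-1a are stand-ins for whatever generator and seeding iocaine really uses, and this is not the handlebars or minijinja API.

```rust
use std::collections::HashMap;

/// Tiny deterministic PRNG (SplitMix64), used here only as a stand-in.
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

/// 64-bit FNV-1a, used to turn a request path into a seed.
fn fnv1a(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// A helper receives the per-request RNG instead of owning one.
type Helper = fn(&mut Rng) -> String;

fn helper_number(rng: &mut Rng) -> String {
    (rng.next_u64() % 1000).to_string()
}

fn helper_word(rng: &mut Rng) -> String {
    ["pencil", "thick", "hoarse"][(rng.next_u64() % 3) as usize].to_string()
}

struct Renderer {
    helpers: HashMap<&'static str, Helper>,
}

impl Renderer {
    /// Built once at startup, then shared by every request.
    fn new() -> Self {
        let mut helpers: HashMap<&'static str, Helper> = HashMap::new();
        helpers.insert("number", helper_number);
        helpers.insert("word", helper_word);
        Renderer { helpers }
    }

    /// Per request: seed a fresh RNG from the path and run the named helpers.
    fn render(&self, path: &str, names: &[&str]) -> String {
        let mut rng = Rng(fnv1a(path.as_bytes()));
        names
            .iter()
            .map(|name| match self.helpers.get(*name) {
                Some(helper) => helper(&mut rng),
                None => format!("[unknown helper {}]", name),
            })
            .collect::<Vec<_>>()
            .join(" ")
    }
}
```

The RNG never appears in the template-facing names, only in the helper signature, so templates cannot reach it directly; the same path always renders the same output.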

...but my brain keeps falling asleep, so I guess this will be a tomorrow thing.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

After a quick glance through the docs, I prefer minijinja > tera, and ruled out askama for now. My gut feeling is that minijinja can do everything I want, better than handlebars, and provide a richer templating language at the same time.

But! Sleep. This can wait until tomorrow.

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Well, the problem with minijinja is that if I want to have an app-wide Environment, which I do not rebuild for every request, I'm dealing with lifetimes suddenly.

The current architecture of iocaine does not like lifetimes.

This means that I either do the same thing I did with handlebars, and build an Environment for each request, or rewrite large parts of iocaine. The former... isn't worth it at this point, and the latter especially not.

So this project is getting postponed for now.

@algernon how do you find time for this on top of family and a full time job!!

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@arichtman By not having a full time job :P

@algernon I'd be interested in setting the generator seed myself sometimes, or a "lifetime" seed: it would allow me to have some consistency across pages. For instance, I could have a menu with items or a footer that stay the same across requests, while still being random, making the site shell more stable and plausible.

But that might not be possible?

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

@UnePorte That isn't possible right now. I agree, it would be useful, it would give you more power and flexibility, but it's not something that I can sanely retrofit into the current iocaine architecture.

If (or rather, when) I rework how templating works, I'll make it easier to have more control over the RNG too. That may be a while yet, though.
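
Purely as an illustration of what more control over the RNG could mean, one option is to derive two seeds: one from a site-wide secret alone (for a menu or footer that should look the same everywhere) and one from the secret plus the path (for per-page content). The names `site_seed` and `page_seed` and the use of FNV-1a are assumptions for this sketch; nothing like this exists in iocaine according to the thread.

```rust
/// 64-bit FNV-1a hash, used here to derive seeds.
fn fnv1a(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Seed for content that should be identical on every page (menu, footer).
fn site_seed(site_secret: &str) -> u64 {
    fnv1a(site_secret.as_bytes())
}

/// Seed for content that varies per page but is stable for that page.
fn page_seed(site_secret: &str, path: &str) -> u64 {
    let mut data = site_secret.as_bytes().to_vec();
    data.push(0); // separator, so ("ab", "c") and ("a", "bc") hash differently
    data.extend_from_slice(path.as_bytes());
    fnv1a(&data)
}
```

A template shell seeded with `site_seed` would render the same menu on `/a` and `/b`, while the body seeded with `page_seed` would differ between them.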

algernon, the tired, Nth of his name... fuck this, I'm going back to sleep

Mngh. Keep coming back to this, because I am genuinely unhappy with handlebars. I might go ahead and replace it with minijinja, and keep building an Environment for every request.

Not a performance win, but more flexible, more powerful templating would be nice. It helps that it is nicer on the Rust side, too.
