Choosing a more optimal `String` type (swatinem.de)
25 points by mhutter on Sept 22, 2023 | 19 comments



This leaves off several relevant string types, including ecow, which was encouraged over smol_str [0]

See https://github.com/rosetta-rs/string-rosetta-rs for some more analysis. HipStr also has a decent table [1]

[0] https://www.reddit.com/r/rust/comments/117ksvr/ecow_compact_...

[1] https://crates.io/crates/hipstr


There should be a from_raw_parts-like interface for Arc that can construct one out of a pointer to a T + control block. That way you can shift the string data over and stick the control block inline (which is how I assume it's laid out in memory) and convert String to Arc<str> for free (well not free, shifting takes O(n), but so does the copying in Arc::new, so in the end you do end up avoiding an allocation for free)
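
For comparison, the conversion that exists in std today is the `From<String>` impl for `Arc<str>`, which allocates a new buffer for the counts plus the bytes and copies the string into it. A minimal sketch (the helper name `share` is made up for illustration):

```rust
use std::sync::Arc;

/// Converts an owned `String` into a shared `Arc<str>`.
/// This goes through std's `From<String> for Arc<str>`, which makes a
/// new allocation and copies the bytes; the String's buffer is not reused.
fn share(s: String) -> Arc<str> {
    Arc::from(s)
}
```

After the conversion, `Arc::clone` only bumps the reference count, so every clone points at the same bytes.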


Unfortunately, this can't really be done efficiently with how allocation works in Rust. A String is always allocated with alignment 1, but an Arc's control block requires usize alignment. An allocated pointer can't be deallocated with a different alignment than it was allocated with, nor can its alignment be changed when it is reallocated. Thus, the from_raw_parts() idea would be unusable for any types with less than usize alignment.

(And the control block really does need to be aligned, since otherwise we couldn't perform atomic operations on it.)
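
To make the mismatch concrete, here is a rough sketch using `std::alloc::Layout`. The `Counts` struct is only a hypothetical stand-in for Arc's header, not its real definition, and both function names are invented for illustration:

```rust
use std::alloc::Layout;
use std::sync::atomic::AtomicUsize;

/// Layout of a String's heap buffer with the given capacity:
/// an array of `u8`, so the alignment is 1.
fn string_buffer_layout(cap: usize) -> Layout {
    Layout::array::<u8>(cap).expect("capacity overflow")
}

/// Hypothetical stand-in for Arc's control block (strong + weak counts).
/// The real type is private to std; this is an assumption for the sketch.
#[repr(C)]
struct Counts {
    strong: AtomicUsize,
    weak: AtomicUsize,
}

/// Layout that a combined "counts followed by bytes" allocation would need.
/// Its alignment is dictated by the atomic counters, not by the bytes.
fn shared_buffer_layout(len: usize) -> Layout {
    let (layout, _bytes_offset) = Layout::new::<Counts>()
        .extend(string_buffer_layout(len))
        .expect("layout overflow");
    layout.pad_to_align()
}
```

The two layouts differ in alignment (and size) for the same number of bytes, and a block has to be freed with the layout it was allocated with, so the String's buffer can't simply be relabeled.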


> A String is always allocated with alignment 1, but an Arc's control block requires usize alignment.

I'll add this to my list of reasons why alignment is terrible.


Wow that is annoying, especially since any malloc is going to give you 8-byte aligned allocations anyways.


Optimal is optimal. There is no such thing as “more optimal”. It’s either optimal or it isn’t.


Something that is optimal under some set of assumptions and under some specific conditions is likely not optimal under all conditions. Does that mean it's not optimal under any conditions? If not, is something that is optimal under more conditions not "more optimal"?


Aaah, `Cow`, the unloved child ...


I love Cow, except when it goes mut.


&mut Cow<Disease>


I presume they're trying to avoid having to worry about the lifetime, so they want something that owns its contents.
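
A small sketch of what that tradeoff looks like (the `normalize` function and `Label` struct are invented for illustration): `Cow` avoids the allocation in the common case, but the lifetime follows it into every type that stores it.

```rust
use std::borrow::Cow;

/// Borrows the input when nothing has to change and allocates only
/// when a replacement is actually needed.
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

/// Any struct that stores the result has to carry the lifetime along,
/// which is the part an owning string type avoids.
struct Label<'a> {
    text: Cow<'a, str>,
}

impl<'a> Label<'a> {
    fn new(input: &'a str) -> Self {
        Label { text: normalize(input) }
    }
}
```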


‘More optimal’ is oxymoronic; ‘better’ is just that.


From 1983:

Although "optimum" is an absolute term, like "unique", it became common verbal practice to make it relative: "not quite optimum" or "less optimum" or "not very optimum". Mel called the maximum time-delay locations the "most pessimum".

https://users.cs.utah.edu/~elb/folklore/mel-annotated/node1....


Even worse! ‘Optimum’ in English is a noun, not an adjective.


From 1993:

Verbing weirds language.

https://www.gocomics.com/calvinandhobbes/1993/01/25


I prefer "more betterer".


And the superlative, most betterest.


Why use strings and pass them around in the first place? Why not limit the use of String types to system boundaries where data must be serialized into or from character-oriented streams?


The typical strategy for doing that in a context like this is string interning. That's literally what most of the types described in the post are doing behind the scenes, in one form or another. The post is literally about doing your suggestion -- just using a library to do so, and leveraging the type system to reduce some of the boilerplate.
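
For anyone unfamiliar with the idea, a bare-bones interner could look roughly like this. It is only a sketch of the general technique using std types, not how any of the crates in the post are implemented, and the `Interner` type is made up here:

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Minimal single-threaded interner: each distinct string is stored once
/// and handed out as a cheap-to-clone `Arc<str>`.
#[derive(Default)]
struct Interner {
    strings: HashSet<Arc<str>>,
}

impl Interner {
    /// Returns the shared copy of `s`, allocating only the first time
    /// a given string is seen.
    fn intern(&mut self, s: &str) -> Arc<str> {
        // `Arc<str>: Borrow<str>`, so the set can be queried with a plain &str.
        if let Some(existing) = self.strings.get(s) {
            return Arc::clone(existing);
        }
        let shared: Arc<str> = Arc::from(s);
        self.strings.insert(Arc::clone(&shared));
        shared
    }
}
```

Interning the same text twice returns handles to the same allocation, so the string data only gets built once at the boundary and everything past that point passes around a pointer.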



