Hacker News new | past | comments | ask | show | jobs | submit | page 2 login
Ask HN: What problem are you close to solving and how can we help?
263 points by zachrip on Aug 29, 2021 | hide | past | favorite | 472 comments
Please don't list things that just need more bodies - specifically looking for intellectual blockers that can be answered in this thread.



How visually implement process in the app? I.e. how to guide users over complex process they need to do in the app to achieve success?

The process might span different medium (write email, do something in the app, check twitter, etc) and different activities multiple days. How to make sure they know what they should do next? Checklist? Emails? Slack? Wizard?


This would probably require a lot of front loaded work in your case, but if you need to train a boat load of people with very few (or zero) trainers, my favorite way to do it is (Atlassian’s Atlaskit Onboarding/Spotlight components)[https://atlaskit.atlassian.com/packages/design-system/onboar...]


Interesting... My to-go solution for this would be a detailed wiki page with screenshots, and link to that from a bunch of places. But I guess that's not an ideal solution really.


What do you use for the wiki?


World of Warcraft


we use little floating checklist from userpilot.com


I‘m trying to re-/sell cheap bulk object storage, by renting cheap dedicated servers (e.g. Hetzner), connect them using 10GbE and putting them into a big Ceph cluster.

My problem is how to bill people for consuming object storage properly. Do you do it retrospectively and take the fraud risk? Are there any pre-existing platforms that do Ceph billing?


This is what DigitalOcean did to create their block/object storage. If you're interested in doing this type of thing in a career capacity the storage team is hiring a lot of people right now. Feel free to reach out to me as I'm on that team. :)

https://www.digitalocean.com/blog/why-we-chose-ceph-to-build...


Can you explain how the economics of what you are doing would be competitive with something like S3 or B2? I feel like there could be a market/margin here but there are a lot of numbers involved to figure out the specifics.


Just take a look at the egress pricing of B2 (10USD/TB) S3 (92 USD/TB) and then look at Hetzner's 1 EUR / TB. There is quite a margin there - same thing with the storage costs (23,5,1.5).


I'm currently thinking about starting a very similar project, would you like to talk about it and exchange some learnings?


Sure, how can I reach you?


claudioreiter on telegram or via email hn@dagobert.pw


I find it hard to find communities for my ever-changing niche interests.

I’m working on community discussion boards which exist at the intersection of interests.

Eg. Mountain Biking/New Zealand, Propulsion/Submarine, Machine Learning/Aquaponics/Tomato, etc.

The search terms for interests are supplied from Wikipedia articles which avoids duplicate interests and allows for any level of granularity.

I find that key word functionality in search engines has degraded to the point that finding good content for niche interests is difficult. I’m hoping with this system I can view historical (and current) discussions around my many niche interest combos.

I’ve got the foundation done, I just need some feedback/advice on whether I’m reinventing the wheel here, or if others share this problem?


this kind of sounds like reddit to be honest, is reddit lacking something you wish would be there?


Reddit is great for finding general topics - science, news, hobbies, etc.

And the goal of this isn’t to replicate those more sweeping discussion boards because they’re great!

My issue is that once things get more niche, the subreddits are tough to find - they could have obscure names, and even just learning about them often involves another user recommending. Plus creating a subreddit for every niche intersection isn’t ideal.

With this I could zone in on exactly what I’m after, without sifting through all the unrelated stuff. With the Aquaponics/AI/Tomatoes, I’d be dealing with only that intersection point. Not the peripheral stuff.


The problem - correct string matching at scale. I am aware of fuzzy string matching. The problem is that the two strings can be > 90% similar even if the difference is, for example, one digit in the year of manufacturing. My current solution is to represent the 2 strings as similar as I can based on the available information by transforming (wrangling) the data to match the data as close as possible and then applying constraints based on make, model and year (they should be the same). It works pretty well, but I am looking for a more interactive (human-in-the-loop) solution.


I'd just slap a GUI / audit logs on top. Show the intermediate data (the “wrangling”), show the computed similarities, show the conclusion (this met that threshold, and the other was equal, so it's category seven).


Can you elaborate on the technical details: which language, library or framework would you use?


Tkinter, probably. Or a web interface. Depends on what I'm doing, honestly – the answer will always be “whatever's currently being used”.


I’m facing an issue where I store small binary data blobs within a Postgres column in order to benefit from delete cascades.

I’m considering moving the binary data into S3 and then doing the sync layer on the server (which means the front end requests the data from the backend and is given it back as a JSON object with base64 values).

Doing this manually via code isn’t impossible, just API intensive, so I’m wondering if this is a solved issue for anyone.

The why: The JSON blobs are recordings of words and sentences that can be copied between articles.


It's hard to give any advice on this until you detail the problem you're trying to solve. For example:

Where/why is your current system failing/inadequate/cumbersome?

Why do you want to move the data to s3?

Why are delete cascades important?


I've been thinking for a while about building a tool to generate ETL/ELT jobs for data warehousing. Yes, there are lots of such tools already, but I've become frustrated in one way or another with all that I've used so far (mainly with their bloated size and clunky "repositories" and inscrutable runtime engines). This "while" that I've been thinking is stretching out -- the other day I stumbled across my earliest vague musings on the subjects, and noted that they are a couple of years old already, so I'm beginning to think it's time to stop thinking and start building...

For various reasons -- mainly familiarity, FOSS ecosystem, and cross-platform compatibility -- I'm going to try to implement this in Free Pascal / Lazarus. There is one kind of component I'm definitely going to need, and if there were a ready-made one I could use in stead of building one from scratch, it would save me a lot of time and effort. I've looked around online, but so far haven't found "the perfect one". So, my question is:

Can anyone recommend a good FOSS graphical SQL query builder component -- i.e, one which presents tables and their columns to the end user, so they can specify joins and filters by clicking and dragging, etc -- for Lazarus (or in a pinch Delphi, to port to FP/L)?


I'm looking for a way to integrate a React app with an existing Vue... thing. Don't really need any communication between the two, just displaying it would be fine. My issue is: the Vue code just throws in <script> tags in the html and expects global variables (location instead of window.location), while the React code uses ES6 imports. The only partly working way I found is including <script> tags with a useEffect, but that doesn't play nicely with apex-charts for some reason, and includes forcing the html with a dangerouslySetInnerHTML after importing the existing file as a long string. Sub par, obviously. In addition, I'll probably need to include different vue apps in a couple different instances. Any suggestions? I think I might just keep them separate and open a new tab for the explanatory session. thanks!

Reasoning: helping with the code behind a paper on explanatory AI systems.

Related code on github if curious https://github.com/pollomarzo/map-generation/tree/main/graph

EDIT: thanks for suggestions will give them a spin throughout tomorrow :)


Have you tried SingleSPA? I’ve used it in the past to get React apps inside an old AngularJS app to replace parts of it over time and it worked pretty well. Docs say it works with Vue as well but I don’t have any direct experience with Vue at all far less for this kind of task so I’m not really sure it will work but it is worth looking at https://single-spa.js.org/


Have you tried using `useRef` and attaching the Vue component to a parent React node? I haven’t does this with Vue, but have with a vanilla JS chart library.


>just displaying it would be fine

Iframes with postmessage() where needed (like dynamic window size changes) isn't pretty, but it's easy to do.


How to create a good app.store for a smartphone OS?

Users should be able to install whatever software they want. Similar for developers, they should be free to publish whatever software they made.

Apple/Google approach is suboptimal because centralized point of failure. And they do censorship on their stores, both political and arbitrary.

Linux approach is suboptimal because users don’t have keyboards to create these sources.list text files. Even if they had qwerty keyboards, I don’t like the UX, too hard to use.

Traditional P2P like bit torrent + DHT is suboptimal because smartphones, would use too much electricity + bandwidth to be practical.

So far, I’m thinking about developers-hosted binary packages, and existing code-signing infrastructure for authenticity and integrity (Verisign, Comodo, Digicert, these guys, up to developers to choose one). The configuration issue from Linux should be solvable with QR codes scanned by camera, plus a custom URI handler for the web browser on the phone.

The main thing I don’t like about that approach — a store app on the device is a good UX from end users’ perspective. Yet it seems impossible to make one with that approach.

I’m very far from being blocked on that yet, but I will face that problem eventually.

P.S. I’m not going to solve security at that level. As Android store shows us, it’s borderline impossible even with Google’s resources. Modern mobile SOCs have enough juice to solve that properly, on the lower levels of the stack. Most of them support hardware-assisted virtualization. All of them are fast enough to run proper multi-user Linux, with security permissions and SELinux kernel module.


How to get my kids to bed to bed.


By no means am I an expert, but just another parent on HN.

1. Ritualize

I notice a pattern with my kids: going to bed is a ritual, and any deviation is reason enough for them to leave the bed.

2. Slow down in advance

Going to bed after playtime is impossible. So cut screens, playtime, play/listen some quiet music, read books or newspaper (again, no tablet/reader), at least 1h before bed time.

3. Recap the day

Remind your kids of their day, activities and make them aware of the fatigue. Works better when in bed, with mine.

4. Stay with them if they're afraid

Learn why they're afraid, teach them why there's no reason to be afraid. I've had to hang a sock to the door every night for months to scare tigers away :D It just works ^^

Every parent knows the pain and every kid has their own back story and the relationship with the parent(s) is key to finding a way into bed.

Eventually, they will sleep.

In our case, we settled that unless exceptional situations, our kids had to fall asleep in their own bed, because we wanted/needed our intimacy. To get that, I had to stay in my kids' room for as long as 2hrs for months, but didn't let go. Today, going to bed is thankfully not a situation anymore.

I'm not sure this is in any way helpful, but here's my shared experience and learnings. YMMV.


Best trick I've learned (and I have more than twice as many kids as the average citizen here :-) is to:

1. make sure they don't fall asleep with something they cannot keep all night (i.e. while you are singing to them, rocking them, sitting next to them or when they are drinking a bottle of milk etc)

2. make sure they understand that even if you leave the room it is just temporarily. Small kids are - for good reasons - very afraid of being forgotten or left alone.

2.1 Using a timer to remember to visit the room regularly and often as they learn to sleep alone can help a lot

2.2. Increase the interval each day. I increased it by two minutes each day.

2.3 If the kids are happy in their bed, continue to visit their room at the scheduled time: you don't want them to think that you forget them if they don't cry.

Using this method I've got my last few kids to enjoy going to bed and sleep better in less than a week fo each of them.


For mine: A reading light with a remote-control timer. We read a story together, brush teeth, then they get 10 minutes of independent time with their own storybook. But kids vary, so good luck.


This book has a lot about parenting, including how to get your kids to go to bed:

https://www.amazon.com/Bringing-Up-B%C3%A9b%C3%A9-Discovers-...


in general aim for repetition and stringing activities together.

So for us, when I first started to do this. Each night they get a 'treat' but to get that treat they first need to be ready for bed - eg bedroom ready to sleep in, correctly dressed/ washed etc..

then after the treat they must choose a calming activity - ideally in their bedroom eg reading (nothing that gets their heart rate up) for 30-60mins then they must bush their teeth

It's this point we say time for bed, but we allow them to carry on reading for another 30-60min then it's lights off

if they don't do the activities/ actions after the treat, then we warn them that they'll not get one tomorrow etc. (and really do what you say)

also you may need to flexible on the activities until they get into the swing of it


If they are young < 3years just run hairdryer song from YT. I give 5 mins max. You are welcome.


Heu, raise your voice


Try the 28hour day. Once your kids are sufficiently tired, they'll go to bed by themselves.

https://xkcd.com/320/


... not a parent?

(The only thing worse than trying to get a child who isn't tired to go to sleep is a child who is too tired to go to sleep.)


... I've got two! (0.3y, 2.7y)

Apparently I also still have a sense of humor. Or maybe I don't, because perhaps pretending one doesn't get the joke of doubling down on xkcd silliness is perhaps a joke in itself, which I didn't get.


Benadryl


Benadryl, no. But melatonin, sometimes, yes. My rule is to have a hard cutoff time after which it's better to take melatonin than to continue the cycle of whining, sleep deprivation, and next-day misery. The cutoff is late enough to have plenty of time to try all the other things involving wind-down rituals. It is not an every day thing. I found that having the consistency actually helps establish the rituals too.

Also, bedtime trouble usually means not enough outside time and physical activity during the day. Or that the kids want more of the parent's time.


Look into the research behind melatonin use. I’m not a doctor and certainly long term use of diphenhydramine is associated with neurological problems in old age, but I’m not sure melatonin should be used as a simple hypnotic as you are suggesting. It’s natural but so is testosterone. Hormones may not be good to tinker with. I say that as a long time user of melatonin. At the very least you may want to stick with lower dosages- nothing over 1mg which is about as low as is easy to find.


Kids versions don't come in anything higher than 1mg and one can make it a half-dose quite easily. It's really more of a last resort thing, and definitely not for every night's bedtime. How last resort? Maybe once or twice a month. Now that my kids are a bit older and bedtime rituals are established it's even more rare.

I realize that some parents reach for it every night and this is not something I'm suggesting.


I could really use some help with adversarial attacks.

If there's someone trying to use CV to recognize stuff and we are trying to prevent it (basically, a black box situation) - is it viable to use adversarial attacks at all?

Will it work long term? Can they overcome our AA by downsampling and adding noise? Can we make another AA that would still prevent them using CV? If it's better to chat elsewhere - my Twitter is in my profile.


If you have access to their model it's possible, and adversarial attacks can be made to survive different sorts of processing. But like all piracy prevention tools it's a game of cat and mouse so if you're working against an intelligent adversary it'll come down to how many resources each of you can contribute to this fight. (Btw your twitter is not in your profile. Not that I have anything to add in private, just thought you should know if you're relying on it.)


No, the case is not having their model. I understand it's a game of cat and mouse, but this one case is gonna have a very quick iteration cycle, so I'm not sure if it's worth trying at all, it's not the same thing to test an adversarial attack on a self driving car once.

Thanks, it didn't update my profile for some reason, fixed!


every week I read about autonomous driving adversarial attack, so I would say it is beyond feasible.


I've been working on applications of network flow theory and ecological measures to macroeconomics to determine mesoeconomic scale measures for use by decision makers and businesses. These measures use certain properties to determine both stability due to sudden shocks to the economic system as well as what various policy actions will have on the overall health.

This includes entirely different ways of visualizing free trade, etc.

I have made some significant progress in the past year, but I am running into this headache where the papers I need are not easily accessible (or cheap) since I'm not an academic and I'm literally doing this as a for fun side project. I've been trying to find a university or college willing to take me on in some capacity simply to let me get access to academic catalogues, but I am not having that much luck unfortunately.

The best results recently though include some really fascinating stuff that comes from statistical measures that give clarity to how a region would respond to being isolated for any period of time.


What are the resources to look at for designing the architecture of a scheduling system (for coroutines/threads)? Would I first look at how Operating systems implement it? Or how VMs like Erland implement it?

Same question for effects systems, where would I look to understand how they're designed and the trade offs for their design decisions?


It indeed makes sense to look at on OS. You might want to start with a simple one, like FreeRTOS - which is more or less just a task scheduler at core.

I would recommend not to start with coroutines in the beginning if your main focus is scheduling. In the end coroutines and async/await is about building a userspace scheduler on top of a scheduler that already exists in the OS, so you just get twice the amount of logic. However the schedulers used in userspace are often a lot more trivial than the OS ones, since they don't support preemption or priorities. Erlang might be the exception and an interesting thing to look into.


So don’t start with coroutines and just look at OS scheduler and possibly the erlang VM?


Everyone starts with callbacks, moves on to coroutines, and eventually ends up writing polling loops. Can take decades for each person to work through it.

Erlang's messaging may have potential as another alternative


By polling loop, do you mean like a while loop that checks a queue for work at a preset frame rate?


Go is a modern take on coroutines (M:N threading), when everyone else seems to be going the async/await route. Though, I don't know how simple that scheduler implementation is (or if there are books about the internals).

Since the Linux kernel has pluggable schedulers, that code might be well-structured for reading. Again, I don't know the specifics.


Need help to decide the tools to be used for the below problem:

The system is a bunch of batch jobs that are scheduled to run at different intervals. These jobs can be modelled as an acyclic directed graph of steps. They basically download files from vendors and map the rows inside them into a generic format (for generating reports). There are a lot of vendors and each vendor can have a different file format containing different fields -- hence requiring custom business logic to populate (map) the corresponding generic file (like aggregating fields, fetching values from DB, etc.). Also these vendors' files sometimes contain errors, or are dropped late for download, etc. -- failures can happen and these failed instances of jobs should be able to rerun.

Existing system is built using Spring Batch and Spring Integration. The problems with the existing system are:

1. there are more than 200 jobs and most of them have their own custom logic during mapping -- cannot be generified

2. lot of manual work needed to onboard new vendors

3. jobs are synchronous and run only on one node, typically for lots of hours

4. rerunning jobs is a nightmare

Dream state for this system:

1. Dynamically add jobs to the runtime using generic components that can be reused -- maybe through an API / UI

2. Preferably, multiple records from a single file be processed across distributed nodes to generate a single output generic file

3. Rerunning should be easier

I am a noob to CS. I did a good bit of research for the past month. Found a few data-science tools in Python -- which is a no-no for a production system. Also, I know that the steps cannot be made generic after some extent since custom mapping logic is required for almost every vendor. But asking to see what is possible. Any help to point to prospective tools and technologies to solve the above will be much appreciated.

Thanks


Use Airflow maybe?


Looks very promising. Can I add new jobs (tasks in Airflow's jargon) reusing my custom steps (operators in Airflow's jargon) during runtime? Also, is there something similar in Java, Go, etc.?


I'm trying to make social media moderation more democratic and using that to decide fuzzy questions like "should this post be censored", or "is this misleading" [0]. While the crowd's answer won't be perfect it will help sort through a lot of the noise and feel better than the decision of whatever mod happened to create the subreddit.

The problem: how can I make decisions based on a sample with a binary question. I think the central limit theorem applies, and I need to account for various priors and missing votes. Is there an existing solution to this problem? The server is written in nodejs, if that matters.

[0] - https://efficientdemocracy.com/about/what-is-this


I worked on a similar idea last year. What I did was take urls to content, scrape the content, and pipe it through a machine learning a evaluator to apply various labels and warnings to content. Lastly, add some nice embeddable UI to surface the report.

I got it to a decent state, but didn’t know how to propagate it or inject it into social communities. I wanted people to be able to tag it on Facebook, and it would reply with an informational card with the analysis and summary.

https://github.com/dino-dna/informed-citizen


Cool! Was it supervised learning?

I feel like machine learning isn't at the level where it can tell if something is misleading, unless it's from a known sketchy source.


This might be a good use-case for the "bayesian truth serum" http://economics.mit.edu/files/1966

This applies when our questions are not just trying to learn about the world (e.g. 'Our survey discovered that 10% of posts are considered misleading'); we are going to use their answers to decide on actions, e.g. removing posts, attaching warning labels, etc.

Those answering the questions know this, and (if they have a preference over which action will be taken) are incentivised to give more extreme answers. A classic example is an ice cream company surveying shoppers about the flavours they like: if I truthfully answer that I like chocolate slightly more than strawberry, this will have a small effect on the survey result, and hence the company's new product flavours. However, if I falsely say that chocolate is the best flavour I've ever encountered, and that strawberry makes me vomit, that will have a much stronger effect on the survey result, and make it more likely that the company will make the chocolate ice cream that I prefer.

The "bayesian truth serum" counteracts this by asking each question in two parts: there's the initial question we want answered, as well as an additional question: "how do you think others will answer?". For example:

- "I find this misleading" and "I think 80% of respondents will find this misleading"

- "I rate strawberry as 4/5" and "I think 10% of respondents will give strawberry 1/5; 20% 2/5; 50% 3/5; 15% 4/5; and 5% 5/5"

The first answers (the ones we care about) are weighted based on two conditions: how closely the estimated distribution matched the real answers, and how 'surprisingly popular' the first answer is.

To see why this cancels-out the incentives to lie: our best chance of affecting the result is to choose a 'surprisingly popular' answer, since this will contribute more weight to the result. However, these two constraints exactly cancel out:

- The answers we predict are popular, will also be those we predict are unsurprising (after all, we could predict them!)

- The answers we predict will be surprising, will also be those we predict are unpopular (that's why it would be surprising if they were popular!)

It turns out that the rational strategy, for swaying decisions as much as possible towards the outcomes we want, is to answer the first part truthfully.

A similar analysis applies to answering the second question (the estimates) truthfully. In that case there are two things to consider:

- We want our estimates to be as close as possible to the true distribution, in order to maximise our response's weight.

- We want to engineer our estimates such that the answers we disagree with get a high estimate, and hence appear 'unsurprising' (reducing the weight of those responses). Our estimates must sum to 100%, so decreasing the 'surprisingness' of one answer must increase the 'surprisingness' of the others. The effect we have on each answer's weight will be small, but it will affect every response which chooses that answer. Hence to have the largest impact, we need to decrease the 'surprisingness' of those answers we think will get the most responses. Yet that exactly what we've been asked for (an estimate of how popular we think each answer will be!)


That's a very interesting system! I've been reading about it but I'm not sure how it applies.

> (if they have a preference over which action will be taken) are incentivised to give more extreme answers.

With yes and no answers, how can answers become more extreme? If you are asked "Is this misleading? Yes/No" and it's only marked as misleading when "Yes" is the majority, then you are incentivized to answer with your true opinion. If you want the post to be marked misleading, then answering yes increases the chance that it is marked as such.

The bayesian truth sereum information score makes sense when you are trying to reward people for truthfully answering your questions; for example by paying them [0]. When asking "Is this misleading?", how do you use the information score to compute who won?

[0] - http://www.eecs.harvard.edu/cs286r/courses/fall10/papers/DW0...


For yes/no questions I think (but haven't checked the math) that the incentive is to shift the distribution closer to my opinion. If my opinion is, say, 75% that it's misleading, then the truthful response would be a coin toss with bias 75%.

However, if I know my answer will affect censorship, etc. then I may try to predict the resulting distribution, and vote yes if I predict it's less than 75%, and no if I predict it's more than 75%.

For example, I may be more "trigger happy" if I think people are more likely to believe something uncritically; I may be more of a "devil's advocate" if I think something is under-represented, or less likely to be taken seriously.


> how closely the estimated distribution matched the real answers

> A similar analysis applies to answering the second question (the estimates) truthfully.

How does this avoid (or compensate for) downweighting the preferences of people who are legitimately ignorant about what everyone else thinks (and consequently give estimated distributions that hardly match the real answers at all)?


The weights can incorporate a factor (0 < α ≤ 1 in that link) which adjusts the contribution of the prediction's accuracy. When α = 1, we get the zero-sum, purely competitive situation; we can make accurate prediction less important by choosing α < 1.

Although truth-telling is a Nash equilibrium of this setup, it's not the only one. However, as α → 0 the truth-telling equilibrium becomes dominant (i.e. achieves a higher expected payoff).


TLDR: What is the state of the art for one-shot or few-shot longitudinal (time-series) machine vision tracking of object boundaries with ~ pixel (~ 10 μm) precision?

Specifics: I'm tracking the edges of the knee meniscus from time lapse video (~ 1000 frames) to measure its deformation under load. This is in the context of research to prevent osteoarthritis. Due to material rotation and irregular geometry, background edges that started off occluded come into and out of view over time. This tends to confuse both machine vision algorithms and human labelers. Because the tracking is for strain measurements, the tracked edge must be the same in all frames; therefore, the tracked edge is be a foreground vs. slightly less foreground edge of the same material (low contrast), not foreground vs. background (much easier). Only few-shot approaches are likely to save time because only ~ 20 specimens are needed to accomplish the immediate objective, and follow-up experiments will probably differ enough to require re-training.

The current plan is to Google "few-shot image segmentation" and try things until something works or the manual labeling effort finishes first, but maybe one of you knows a shortcut. Work is also ongoing to bypass the problem by enhancing edge contrast or using 3D imaging, but machine vision would be the most cost-effective solution.


this more like a tool suggestion request, because I haven't been able to find a solution in Google.

I'm looking for a more advanced duplicate files finder in Linux, specially one that can handle folders.

Most tool just return a list of duplicate files, but what I need is to also know is if whole folders are duplicate, or subset of others and have it presented in an elegant way t o resolve conflict. ofc everything is in a big folder and everything is a mess, bit of a hoarding collection of files that got accumulated over the years.

I could probably code some stupid script myself, if I ever got the time (spoiler I probably won't) but I have no idea on how to present the result elegantly. So it would be nice if such tool already existed.


https://meldmerge.org/ might be what you are looking for. But if there are tools which can find duplicated files and directories which are named differently, I'd be interested in learning more about these as well.


meld can compare two directory if they have the same structure and name for files but can't really tell if I inside my foot folder, I have a folder root/X which is also present as root/category/b/X. (except all the file in it have been prepended by "foo" somehow)

For tools which can find differently names files you have for examples: fslint (gui) or fdupes (cli)


That's true, meld is designed for comparing directories which already have similar structures.

It seems like what we're looking for is content-addressable storage [1]. The theory behind it appears to be based on Merkle trees and cryptographic signatures [2].

IPFS already has an implementation [3] of this, and there are other implementations (borgbackup, restic, zpaq) listed in that link and in the content-addressable storage article. Disclaimer: I haven't used any of these yet, just found them a few minutes ago.

[1] https://en.wikipedia.org/wiki/Content-addressable_storage

[2] https://gist.github.com/mafintosh/464bb8f1451f22c9e5c5

[3] https://discuss.ipfs.io/t/ipfs-and-file-deduplication/4674


Problem: Create image with text. Solution: https://img.bruzu.com?a.t=Text

You can help by trying the API and giving honest feedback.

https://bruzu.com


The API docs' example URLs can't be copied and pasted, as "%20" gets inserted. I'm probably not the target audience for the API, but it's a neat idea. I think people will end up with images described with tons of magic numbers that relate to each other in invisible ways and become unmodifiable and unmaintainable. Variables might help, as might relative positioning and sizing of elements.

Some feedback on the Designer:

The size setting dropdown is quite strange. Choices of unfamiliar destinations don't seem to make sense, and some of the things that do seem familiar come out an unexpected size and shape (e.g. "infographic"). The pixel sizes are clearer, but better would be handles on the canvas that can be dragged. There's also a typo: "Choose form a list of sizes".

Circles don't get resized? I can drag handles to make the apparent bounding box bigger, but the circle doesn't change: https://imgur.com/a/KhYluKj. Other shapes seem okay. Chrome 89 on Linux.

The tutorial walkthrough pops up every time I go to the designer, even though I've been right through it.


Thanks a lot for taking time to trying it.

Fixed most of it.


Seems like it would be a lot easier and a lot more powerful to use SVG instead of a giant string of URL parameters.

Your service could provide pre-made templates and an editor, and expose textfields, images, fonts, etc options via URL parameters. Then your service just has to render the SVG and return it as an image with the requested dimensions/format.

Example:

`https://img.bruzu.com?s=<TEMPLATE ID>&title=Hello&font=arial&width=800&height=480&fmt=png`

You could also pass the raw SVG source as a parameter as well, maybe with base 64 encoding or something like that.


Problem with svgs is that the text don't auto scale.


Thanks, yes template is in future plans, will think about svgs.



It seems kind of interesting, what are some use cases you've seen for it?


Use cases:

1. Image generation automation: Like into automation like posting tweets as image to instagram.

2. Image generation scaling: Create multiple images with just variable text, like greeting messages or product images or open graph images.


I am a bit ashamed to ask about such a trivial topic on HN, but I am not really sure how AJAX in Wordpress plugins works.

I have a plugin that exports some WooCommerce orders into XLS. I would like to add a progress bar via AJAX, because the export may take very long for thousands of orders. But I am not really sure how to use AJAX in context of Wordpress specifically.

I would love to see a minimal functional example, a simple plugin that does something similar. So far, all the plugins I saw were pretty convoluted and I lost my track around the code.

(On a related note: a library of elementary examples for Wordpress plugin development would be nice. Like "This is how you create a menu entry.")


First of all, you'll need to make PHP to display "progress", probably you'll need to override ob_start() or something like that, and find a format that let you append new progress on the response on the fly.

I guess you already have an URL on your Wordpress setup that triggers this export. Let's call it {url}/export.

Wordpress already has jQuery by default included. So'll you need to call that URL using jQuery $.post and then, accordingly to the response, update your progress bar.

There is nothing specifically about Wordpress on this, besides the fact that you need to setup your own URL on Wordpress to do this, and then include your own JS after jQuery. That's all.

If you find this too-complicated, a quick-hack is to create a page on WP Admin called Export Tool, and then on your theme create page-export-tool.php. That .php will be called when visited that Export Tool page.


In my experience, accurately reflecting the progress of and AJAX request, so I’ve seen a lot of people (myself included) take the lazy way out and just show an indeterminate spinner or bar just to show that stuff is happening.


It takes a bit of code but if you know the length of the response you're expecting then you can use XHR's "progress" event. Just be aware that the event will happen frequently and contain all the data so far, to avoid inefficient parsing and substring related memory leaks. I think his problem might be more about using JS in WordPress though.


The idea entered my mind, but I am not happy with such cop-out, especially on my own site ... I would at the very last like to see iteration progress, which, while not 1:1 with time, is at least informative.


https://free-visit.net : Like Matterport but with an FPS game engine.

I am looking for my first client: Ideally someone in charge of a Museum/gallery or other grandiose indoor space.


Looks pretty slick. I am not a gamer, and the controls feel very backwards to me. You need to hold the mouse button to look in different directions, which makes it feel like you're dragging, but the direction you move it isn't the direction you're dragging the view. I don't think mouse capture is a good idea in the browser, and if you're aiming at museums and galleries, maybe reversing the mouse direction to make it like familiar dragging would be better.

Edit: There's a bug where if you start dragging with the mouse and let go the mouse button outside the 3D view, it acts like the button is still held down (a bit like mouse capture) which was easier but quite confusing.

It seems unusable with a touch screen on desktop.

The fact that you can fly is useful but non-obvious. I ended up down at floor level and wondering how to see the pictures on the walls.

The demo is in a /fr/ path, but is in English (Chrome offers to translate it to English, because it somehow thinks the English words are French), but then some parts of the interface like "Share your place" are in French.


Thanks for the usefull and very professional feebback.


I reiterate the sibling comment about mouse control, either use the actual mouse locking APIs of browsers, or make drag feel like drag.

Additionally, I think it would be best if by default you were stuck to standing head height, then you can either provide buttons to actually move up or down, or lean more into the game aspect and allow the user to jump. Right now it feels like you are floating around with a little drone or something.

In the vein of controls, please please please support WASD too. I understand if you instruct with arrow keys, since for non-gamer users it might be more obvious, but support WASD (or equivalent of what WASD is in QWERTY keyboards) anyways, for 2 reasons: it is much more ergonomic for people who use a mouse on their right hand (I myself am left-handed but use the right hand for mouse anyways), and is more ergonomic for some laptop users, since many laptops have half-size vertical arrow keys which are uncomfortable to press all but momentarily.


Yes. Thanks for the long feedback. The controls keys are not perfect yes. WASD to add to arrows ? Yes if not to hard too dev.

As for the head, yes you are right : I should add 'something' that tilts a bit the head up/down when needed.


Agree with another commenter about camera height. Couple more things.

1. Your demo level suffers from Z-fighting in a few places on the floors and walls. https://en.wikipedia.org/wiki/Z-fighting

2. When viewed full-screen on a 4k monitor, textures are too low resolution. A handwritten note on the wall is unreadable.

3. Lighting is too simple. Because that’s not an FPS shooter you probably don’t need dynamic lightning nor day/night cycle, but it’s still hard. Ideally you need these multiple PBR textures everywhere, and correspondingly complicated pixel shaders.


1: yes, improving it.

2: yes but it is a tradeoff between texture quality and minimazing loading time.

3: Yea, ideally. But KISS is my priority : We have an editor that aims to be simple enough for all, thus no BPR & no shader.


2 – can you possibly replace them with higher resolution ones after the scene is already running? Ideally, gradually with a blend over ~1 second.

3 — I see. Still, you could pre-compute local illumination automatically in the editor, and bake it somewhere. Maybe into vertex attributes, maybe into another lower-resolution R8_UNORM set of textures.


2- --> The person Who builds the space in 3D With free-visit 'builder' decides how much he squeeze down the texture quality. His choice depends on the target: Smartphone (low quality), computer with big screen (high quality texture)

3- --> I will see with client feedback. I do not want at this point to over-engeneer free-visit. First I must find my market.


Hangs when I click "Try the example".


Just Refresh the page : it will succesfully play (known bug)


Now it works.

I wish AirBnBs and hotel rooms would offer this type of preview of their premises.


I've been working on a distrbuted Layer 2/4 Load Balancer (like Katran, but no C++ involved) that's mostly complete but now needs some testing with large workloads (1m+ concurrents, >50Gbps). Guinea pigs sought.


I’m working on OCR to recognize the scores on pinball machines. I have about a quarter million photos encompassing every model of pinball machine in the world but I just don’t have the know-how to accommodate all the font styles.


Have you tried an off-the-shelf solution like Tesseract? It works quite well if you do the recommended preprocessing.


The preprocessing suggestions I see are to crop out everything except for the numbers and I don’t know how to do that programmatically. There’s many kinds of displays: rollers, 7-segment, dot matrix, and LCD.

The preprocessing to increase DPI to 300 did not help when I tried Tesseract, unfortunately. It’s hard to achieve a good contrast between the numbers and the backdrop


There are a lot of other options and preprocessing methods you can use to get better results. It's hard to tell without seeing the picture but thresholding/binarization might help with the contrast. In order to isolate the text, the mode option also makes a lot of difference: https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html#...

If that doesn't work you'll have to add a text localization model to your pipeline.


Thank you for your guidance. I will investigate further


I’m planning on implementing a transactional (ACID) key-value storage on modern hardware, i.e. SSD and large RAM. The technical choice on question is the type of index storage to use: b+tree, extending hashing, or linear hashing.

There will be an appending WAL for batching the writes and for transaction support. A single checkpoint worker will apply the updates from WAL to the index storage periodically. The updates will be batched, to minimize the random seeks.

Please help me evaluate the three indexing approaches, on the criteria of fast read, fast write, cache friendliness and ease of supporting atomic update during checkpoint.


Ok, I'm double dipping. Another problem I'm trying to solve is: I've got a database of several hundred interesting conversation questions that I've collected over the years. Essentially just strings, though I've attempted to categorize them, rank them, and add other metadata. I'd like to figure out a way to sort them or dedupe them based on semantic similarity, but I'm not sure how to determine semantic similarity without painstakingly going through and manually looking for similar questions. Any suggestions on how to solve this would be welcome.


Put the questions in some semantic embedding space. Now you’ll have a vector representing each question. Then for each question, you can sort all the questions by how far the Euclidean distance is between their vectors. Or use some clustering algorithm like k-means to find clusters.

By web search I found this to tutorial to put sentences in an embedding space: https://github.com/BramVanroy/bert-for-inference/blob/master...

I did not read this and am not endorsing it, but it looks like it’s doing roughly what I’m suggesting.


Yep, exactly this. Check out sentence-transformers https://pypi.org/project/sentence-transformers/0.3.0/, they have some great pre-trained models. Once you have the embeddings you can just compute the cosine similarity.


You could use BERT or Word2Vec or GLOVE. They are very simple to use with HuggingFace's library.


I've been thinking about two interesting problems.

First, differentiating code to build a client side predictor with privacy as a consideration. I have code that describes how to translate domain messages into state changes, and I'm trying to figure out how to predict the effects of sending a message on a client even though the client has imperfect knowledge.

Second, AI for games with imperfect information. Specifically, how to build an AI for Battlestar Galactica.

These are in context of http://www.adama-lang.org/


https://share.securrr.app

A secure client side encrypted document (Passport / Id) sharing service

Needs a DSVGO lawyer as part of the team.


I like this. I dont know if it is your goal, but I'd like to see you (or an alterantive) to succeed and be used by big companies, a la Stripe. Need to share a document? No need to reupload, it's already part of Securr. Share with the company for a limited period of time, Securr make it hard to copy and to store (and illegal), etc.


I love that idea, you should submit that on frontpage.


Can't. As the legal framework does not exist yet. That's why I need a lawyer in the team.


You could potentially submit it as a Show HN and make it clear you very much need a lawyer to take the next step.*

Your call, of course. I'm just a random internet stranger spitballing ideas here for how to take that next step and I know next to nothing about this problem space.

Best of luck, whatever you choose to do.

* Make sure you read the rules for Show HN and use your own judgement there as to whether this qualifies.

https://news.ycombinator.com/showhn.html


Frontpage?


I meant submit it on HN as its own post.


I am trying to design a solution to something I encounted a lot during my hardware development days - keeping track of inventory of small and large parts alike. Basically a shelving system with an app which a user can search and the shelf would illuminate the correct bin.

My problem is, would this be a viable product for companies? It was always a hassle when I worked in labs but I am not sure if it is a big enough problem to design an entire IoT product around.


Sheesh, there's so many things I'm miles from solving this was a tough one.

It took a while but finally came up with something where it was actually closer than I thought.

Thanks for the inspiration.

Solved it.


I’m working on a MEMS problem. I need to be able to 3D print micron (or even sub micron) scale features for a very high aspect ratio MEMS which uses electromagnets to actuate. Only problem is the leading candidate technology EFAB (electrochemical fabrication) only works with conductive materials.

Does anyone know of a technology which provides similar abilities to EFAB but can also print features with non conductive materials?


Would something like Nanoscribe [1] work for you? The resulting structures are non-conductive. As part of our research we use it mostly for 3d-printing nanoscale optical elements, but we have also successfully used it for some mechanical support structures.

[1] https://www.nanoscribe.com


Not sure it gonna help, but some DLP 3D printers are very accurate, and most resins are non conductive. I know researchers were using one for micro-fluids applications. Not sure they can print sub-microns precisions, but 10μm is very real.


Could you use a conductive material, then deposit an insulating coating on top, or oxidize the material on top, making it an insulator?

Aluminum can be anodized, then sealed. As can a number of other metals.


I have an interesting challenge:

I want to add IoT tools in several apartments and allow my guests to control them through an app. Currently I already have an app, but no IoT integrations (basically just reservation).

How to do it safely? Also, which automations would be nice to have but not so obvious?

Thinking from things like light control, but also checking the apartment energy consumption in order to detect appliances that need maintenance before they break.


Web site building complexity, for now I started with a simple static site generator https://mkws.sh/. Right now, I'm not using any package manager, no config files, only one language for templates (sh), and obviously HTML, CSS, Js. I eventually plan to develop a simple CMS based on the same ideas. Ideas and code are welcome!


Hey there, how can I download/install this? I've been meaning to but I couldn't find another way besides https://mkws.sh/mkws@4.0.11.tgz which currently throws a 404 not found error. Any ideas?


> (sh), and obviously HTML, CSS, Js

Cut out the sh dependency and just use the "obvious" tech; make the site capable of generating itself, without reliance on any other tools.


What do you suggest for templating?


ES6 backtick strings.


I believe you misunderstand, I'd rather distance myself from the NodeJs ecosystem and use standard UNIX tools for developing web sites. I believe sh is great as a templating language.


I didn't say anything about NodeJS. (In fact, that would be adding a different dependency, not reducing the count by one.)


So how would you interpret the ES6 backtick strings?


The "obvious" way. (The same runtime that you're planning to use for the JS you mentioned in your original comment. <https://news.ycombinator.com/item?id=28349504>)


And do the generation on the client side?


If by "client side" you mean in the web browser of you, the author of a new piece of content, then yes—a site that is "capable of generating itself". (If you mean templates that are evaluated in the browser of every site visitor every time they refresh the page, then no.)


Ah, yeah, finally got your "capable of generating itself" idea. Pretty cool, interesting to experiment on it. I guess it would be something like a site that downloads itself also.


But then you would have a Js dependency.


Have a look at stuff like Netlify CMS, or any of the already existing website builders like WIX or Weebly or squarespace Etc.


Any headless CMS would work well my `mkws`, my idea is to build a smaller, simpler Wordpress that also comes with a tiny webserver, 0 config, no database, content stored as plain text files, just download and run. https://getkirby.com/ is closer in concept, but without the PHP dependency.


I have two app/websites ideas, one is a way to track your food and alert you when your food can spoil ( and sugest recipes with what you have in your house ) And a learn helper website/app that will give you paths to learn thins and register, track and help you through this process The first one is on the planning side, the second one already have some pages written in python/django


I have a graph with weighted edges. I want to remove edges to make the graph colorable with N colors (e.g. N=40) such that the total weight of removed edges is minimized. If I'm able to solve this problem, that will complete a project I've been working on for years now to make a working keyboard for a person I know that has cerebral palsy.


Here is an algorithm that could work and run reasonably quickly, although it might not be optimal:

1. Find all vertices with degree N > 40 (eg: find all points in the graph with more than 40 outgoing edges).

2. For each pair (a, b) of these degree N > 40 vertices, find the set of common points (c) connected to both a and b by edges, eg: there exists 2 edges (a - c) and (b - c). In essence, you're forming V shapes (a - c - b) where the two tips (a, b) of the V have at least N > 40 outgoing edges.

3. Identify the pairs of vertices (a, b) that are connected by 40 or more Vs (eg: there are at least 40 pairs of (a - c) and (b - c) edges).

4. Remove the (a - c) or (b - c) edge with the highest weight. If (a - c) and (b - c) have equal weight, remove the edge which is connected to the node a or b with more outgoing edges.

5. Repeat steps 3 and 4 until each (a, b) pair has at most 39 Vs connecting them together.

At this point, I think you can color the graph with N=40 colors (one color for a and b, then different colors for each in-between point of the 39 Vs between them).

There might be a way to improve the criteria for which edge to remove in step 4 (maybe using the backtracking approach mentioned by other commenters), but this should be a decent starting point.


Similar to what another commenter said about time, have you tried a backtracking approach by: (1) coloring the whole graph (let's say you end with 42 colors) (2) start with the highest colored nodes (e.g. N=42, which exceeds your threshold of 40) and (3) greedily remove the lowest-weight edges (or maybe, edges to other high-colored nodes) until you are either all colored with 40 colors or we reach an invalid state (ie node no longer in the graph) -- with the latter you can add another edge back and try again by removing the next-lowest edge.


That's what I'm doing now, but the results aren't great. If there's a way to estimate a lower bound on the number of edges to remove, I can figure out if the results aren't great because of the approximation, or because of the nature of the graph...


Graph coloring can mean vertex coloring, edge coloring, or total coloring. These are 3 different problems.

Regardless on the answer, I think this gonna be way too slow for your use case. Especially if you want the global optimum. Wikipedia says the current state-of-the-art is randomized algorithms, I don’t think these algorithms are looking for global optima.

I don’t recommend solving that, I think you’ll waste your time. How’s that related to the keyboard anyway?


Will the graph ever change? Is this a one-time computation?


One time.


Also... is it possible for you to add nodes? Ie split a N>40 node into two nodes.


No, though nodes can be deleted -- at a cost equal to the total cost of all the edges that contain it.


Could you put the graph up in a gist/pastebin?


What are the needs in time?


Can take as long as needed as it only needs to be colored one time. About 10,000 nodes, reasonably dense (about half the nodes will have 1,000+ edges).


What did you try to color the graph ? CSP ? Classic backtracking ? With how many colors it must be colored ?


I have a custom tlv encoding standard. It has 1 or 2 byte of header depending on type with the required info in the header. I have written an encoder and decoder for it, but now want to do it via ASN.1. But, I am not able to understand how to define header bytes and what rule each bit denotes in the header. Do I need to use ECN?


I have to wake up and pee 2-3 times a night. I put pillow under bed on my leg side and make leg side up. It is working so far. Every 4-5 hours from every 3 hours. I also trying Kegel exercise too. I can't stop drinking water before bed because I have acidity too and I can not sleep with dry throat :)


I have a pile of mp3s and want to splice them together with a single ffmpeg operation. Essentially injecting multiple small audio files into a large one using time codes. I know there's got to be a way to do it, but I have yet to find a way to do it in a single operation instead of multiple passes.


Some god awful combination in a complex filter using atrim to pull the pieces, adelay to set the positions in the output, and amix to put all of the output back together could probably do it. What that command may actually be is definitely open but that's probably the only way that would work as otherwise this is really 2 separate operations (chopping then merging) so there isn't going to be consideration for having it in a single premade flag.

On another note if the goal is just to avoid files/writing to disk then a bunch of ffmpeg splices to named pipes as inputs to another ffmpeg command to merge them could do the same without the command soup.


I'm very fine with command soup; the plan was to generate it with a script and run it directly. The input files are in a lossy format so I'd rather avoid having intermediate WAVs or double-encoding or something like that.


ffmpeg -f concat -i mylist.txt -c copy output.mp3

https://superuser.com/questions/587511/concatenate-multiple-...


It's not quite so simple; I need to insert smaller audio files into a larger file at specific time codes.


Splice you parent file at the time stamps, interleave the samples, then concat.


I'm trying to avoid reencoding multiple times, and saving WAV to disk puts serious pressure on storage constraints


You could use FLAC as an intermediate lossless codec which should reduce storage costs by about 1/2 to 2/3 compared to WAV.


Are they sequential, or are you splicing at arbitrary times?


They might be spliced in at arbitrary times but multiple audio files may be spliced in sequentially at an arbitrary time. I'm not going to be upset if there's a millisecond of gap between them or whatever by just splicing at approximate time codes to make them sequential.


Context: Not from the investment banking or trading background. Have been an investor / trader with modest gains.

How does one solve for risk in the markets? As in, mathematically. How does one do short term predictions of prices, with a day or two as the prediction range, with a probability > 0.5.


I’m looking for methods (NLP or otherwise) that maps/generate a short description (few words or less) with a given format (e.g. 3 word tag with each word from a list of choices) from a paragraph or a sentence. Results would be stored and clustered base on their distance


I am not there yet, but I am trying to make education loan-free and based on equity. The details of the project are written here: https://loan-free-ed.neocities.org


> when that student starts generating income then a small percentage from that income is auto-deducted and distributed to everyone who was involved in that student's education.

Your idea, if implemented well, may end up being a net positive for society, but I can't help imagining a future where every child, from the moment they are born, has a biometric ID connecting them to a consortium of companies which provide their education, health care, housing, energy, internet connectivity, transport, media access, and so on.

It would be like living in a company town, being paid in company scrip, except you wouldn't notice the restrictions (as long as you kept earning). If you ever increased your income, your consortium might let you choose whether you want to upgrade your housing or your health care plan, but if you lost your job, they'd force you to take one of their choosing and downgrade your plans if it had a lower salary.

In this dystopia, all consumables from food to toilet paper would presumably be sold by Amazon, and other items like furniture and electronics would be provided as a service so that you rent them from your consortium. The only question is why people wouldn't try to undo this system through the political process, but then we might ask that about the current system.


> Your idea, if implemented well, may end up being a net positive for society

Thank you!

> but I can't help imagining a future where every child, from the moment they are born, has a biometric ID connecting them to a consortium of companies which provide their education, health care, housing, energy, internet connectivity, transport, media access, and so on.

My idea will not have that side-effect because not only is the project non-profit and open source, but it is also decentralised. And if we keep thinkng of a dystopian future then we won't be able to do anything positive unless we become some sort of social revolutionaries. I don't have those skills. But I can think of small ideas to make a positive impact that benefits everyone though. The idea I listed in my original post above is very simple, it helps teachers, lecturers, or anyone who contibutes to a persons monetary decent life via education get appropriately paid for their efforts, and everyone(i.e. businesses) who benefits from an educated person should contribute towards that.

It is a simple idea but notoriously diffcult to deploy because there is a possibility of this getting caught up in a political slugfest.


I want to leverage data provided by the healthcare system and create a machine learning algorithm that monitors the patient at all times by leveraging Iot data, vitals, location, all possible identifiable data without breaching hipaa etc...


Some ZK way to prove that a piece of data was derived from some source, for example, proving a human fingerprint is unique identifying biometric data without showing the data itself or the person it is from and with no trusted authority.


Just thinking out loud here: I think it would require trusted sources (not necessarily a centralized authority) that have validate it and that you trust.

You can prove two balls are different colors to a colorblind person by having them show you two balls of X color and proving to them that you can differentiate them (watered down example), but it requires validated externalities (ex, you can see colors they can’t).

Defining the external validators is the hard part.


If it did require a trusted source, do you think there would be a way for said source to 1) involve no human administration, 2) behave deterministically and 3) be resistant to attacks that could break either of the above 2? That would also solve the problem.


I'm searching for a (reliable) open-source OCR tool for Arabic text.

The best option I tried is Google Cloud Vision, but it's still not accurate enough, and it could get quite expensive for large tasks.

Anybody knows a good software for that?


Trying to find line item based medical bills and laws that govern what information can patients obtain about medical procedure pre and post procedures.

Goal is to bring transparency to medical bills and remove the unknown.


I've been dealing with needing to integrate an arbitrary area of a brane representing a density map.

It's been slowing going getting all the maths done.


How can we make CTOs better? How does one CTO learn? How a community of CTOs can help create value to one another?


Why should CTOs only learn from CTOs?

The average CTO is not as knowledgeable as you might think.

For example, there are CTOs today building applications using no-code platforms, without any academic or technical background. Those people would have much to learn from an software engineering intern at any company.

CTO is a job title, and each company can grant that title at their discretion. Or Technoking, or any other title you might think of.


That's exactly the point, and mostly CTOs don't have much people internally to ask help from.

I am in a community with some CTOs in Latin America and I see this struggle happen every time.


Latin America is a mess in many ways.

VCs there suck, and do not understand that VCs are about inherently risky investments. They want the guarantees of a low risk business with the profitability of a high risk business.

Leadership in Latin American companies also sucks. As soon as the company has any revenue, the leadership will go all out and spend it all on themselves on some extravagant lifestyle adjustments rather than reinvesting it on the company. And because of this, many companies stay small and mediocre without fulfilling their true potential. They waste their money on MBAs learning things they are unwilling to apply.

Then, there is nepotism. Hiring from your family equates creating conflicts of interest, creating situations where relatives of key people do not need to comply with HR, do not need to be competent, cannot be fired and create an unprofessional atmosphere.

Compensation in most Latin American companies sucks. They aspire to be like Silicon Valley startups and hang framed posters of Steve Jobs on their wall, but when it's moment to create compensation packages they grant zero stock and award zero bonuses, all while having the same work-life balance of a startup. They do not understand the key role of employee stock in company growth.

Why the fuck would you work for a wannabe startup that doesn't give you any stock or bonuses? Or work for some entitled aggrandized clown that doesn't understand the simplest technical concepts? And the answer is: because business owners in Latin America have never had to care much about their workforce. Because of exploitation and their informal caste system, having a happy workforce has never been a requirement.

That's why Latin America is undergoing a massive brain drain that will only get worse in the years to come.


This applies 1:1 to Italy, I wonder if there's a shared cultural/religious/economic reason. Doesn't seem to apply, for example, to Portugal.


I don't think your take is wrong, I live here and agree with most stuff.

Yeah, Latin America sucks, I already know that.

Now back to my problem...


How memes could be another Bitcoin.


I am trying to load in raster layers from file in OpenLayers, but have failed so far.


This isn't a technical question, but it's a problem I'm going to have to solve soon!

I just accepted an EM role at a FAANG. This is a career "boomerang" for me - I was an engineer in the past, but then moved into technical support management. I'm coming back to engineering, but this is the first time I will have done it at a large company. All of my engineering experience has been at small scrappy startups where we just did everything, did it fast, and prioritized by whatever was most on fire. I don't think I've ever actually done a proper "sprint". While I have written a lot of code, the workstyle on my new team will be almost entirely foreign to me.

Whose got pro tips for leading an engineering team in a large organization? What makes a high-powered team? What are the easy mistakes that will drive us into the ground?


I've been an EM at a startup and an F500. Don't be scared to have less meetings than your peers. I basically only do one 1-hour sprint meeting each sprint to review the last sprint and to plan for the next one. It may help to have standups daily at first but you can often make them less frequent over time.

Also, always avoid meetings to ideate. These are some of the most common and they are a huge waste of time compared to listing out ideas in a doc and having people review that asynchronously. And yet these meetings have a tendency to get called all the time. For example "there was a fire, let's get all the leads/EMs/directors together for 1 hour to figure out how to avoid this next time".

Your two most important duties are 1) making sure your developers are given space to implement what's important and 2) building relationships throughout the company to better anticipate future needs that helps you do #1.

Happy to chat anytime as well, email in profile.


Take a look at the book "The Manager's Path" by Camille Fournier. It has surprisingly concrete advice for people in your exact position. One of the few "business books" I actually recommend to people.


(Caveat: not an EM myself)

This guy made the switch from dev to EM and has written articles on running effective teams,(e.g. one on how to help devs act as project leads) and even has resources like the docs he sends to tech leads when they start a new project, w/ lists of responsibilities etc: https://blog.pragmaticengineer.com/things-ive-learned-transi... (just linking to the most directly applicable article but definitely browse around)


I'm looking for the best way to implement Multitenancy, Submultitenancy Impersonation with JwtTokens and IdentityServer4 ( dot net).

I'm curious on how other people solved it ( by cookies, subdomain, ... ) and if you used a JwtToken for it.


Looking for something like authzed.com?


I've got the database and query part covered.

But I haven't decided yet on the actually flow. Where I'd identify the current tenant or impersonate him.

+ The influence of impersonation on that flow.


Community owned and operated FOSS alternative to MetaMask.

Deciding which features to ship and what to skip is a bear — it's a big product design space


Please make this a monthly thread!


Totally! Maybe if no one does we should do it every 1st of the month.


adapting the Sunday variation of boyer-moore algorithm to search backwards from point


Short version: I'm close to figuring out how to encourage more prototyping of software by making tests super easy to write during the prototyping process, and so de-risking rewrites. But one problem I've been stymied by is how to represent expectations of screens when they contain graphics.

Long version: My Mu project (https://github.com/akkartik/mu) is building a computing stack up from machine code. The goal is partly to figure out why people don't do real prototyping more often. We all know we should throw the first one away (https://wiki.c2.com/?PlanToThrowOneAway), but we rarely do so. The hypothesis is that we'd be better about throwing the first one away if rewriting was less risky. By the time the prototyping phase ends, a prototype often tacitly encodes lots of information that is risky to rewrite.

To falsify this hypothesis, I want to make it super easy to turn any manual run into a reproducible automated test. If all the tacit knowledge had tests (stuff you naturally did as you built features), rewriting would become a non-risky activity, even if it still requires some effort.Turning manual tests into automated ones requires carefully tracking dependencies and outlawing side-effects. For example, in Mu functions that modify the screen always take a screen object. That way I can start out with a manual test on the real screen, and easily swap in a fake screen to automate the test. Hence my problem:

How do you represent screens in a test?

Currently I represent screens as 2D arrays of characters. That is doable a lot of the time, but complicates many scenarios:

* Text mode character attributes. If I want to check the foreground or background color, I currently use primitives like `check-screen-in-bg` which ignores spaces in a 2D array, but checks that non-spaces match the given background attribute. In practice this leads frequently to tests that first check the character content on a screen and then perform more passes to check colors and other attributes.

* Non-text. Checking pixels scales poorly, either line at a time or pixel at a time. A good test should seem self-evident based on the name, but drawing ASCII art where each character is a pixel results in really long lines or stanzas. So far I maintain separate buffers for text vs pixels, so that at least text continues to test easily.

* Proportional fonts. Treating the screen as a grid of characters works only when each character is the same width. If widths differ I end up having to go back to treating characters as collections of pixels. So Mu currently doesn't support arbitrary proportional fonts.

* Unicode. Mu currently uses a single font, GNU Unifont (http://unifoundry.com/unifont/index.html). Unifont is mostly fixed-width, but lots of graphemes (e.g. Chinese, Japanese, Korean, Indian) require double-width to render. That takes us back to the problems of proportional fonts. Currently I permit just variable width in multiples of some underlying grid resolution, but it feels hacky.

Can people think of solutions to any of these bullets in a text-based language? Or a more powerful non-text representation?


Your project is quite impressive. And welcome to the world of computer graphics!

I'd consider taking inspiration from the following sources:

1. GUI toolkits like QT QML [1] or Android [2]. These typically build a hierarchical tree of different components (eg: start with a root window, which contains panes, which in turn contain text and buttons). Each component may contain different properties (eg: font, color), and properties may be inherited from the parent component.

Advantages:

+ preserves semantics of component properties and how they are linked to each other (eg: the caption is below the image)

Disadvantages:

- complexity: building a layout/constraint engine can be difficult, or alternatively you can use absolute positioning with relative offsets which can be tedious to use (in this case the layer-based approach below might make more sense).

[1] https://en.wikipedia.org/wiki/QML

[2] https://developer.android.com/guide/topics/ui/declaring-layo...

2. Graphical editor programs like Gimp or Photoshop, or Adobe Flash.

These build up a screen as a collection of vertically stacked layers or assets (eg: graphics, text, etc) with attached properties and optionally bounding boxes. Higher layers/assets occlude the content of the layers below them, so you would need to implement some kind of visibility logic.

Advantages:

+ simplicity

+ you can use identifiers for assets, and therefore don't need to perform pixel-by-pixel comparisons.

Disadvantages:

- may lose some information about how different components are related to each other

Also, rather than a raster pixel-based representation, it might make sense to use a vector representation internally [3]. The most popular vector representation is SVG. The full spec is very verbose, so you probably only want to implement a small subset of it. This would permit you to specify properties like line thickness, color, striped/dotted patterns. At render time, you could convert the (proportional) fonts to vectors as well for consistency, and then rasterize the entire scene when rendering to a display surface. But for testing, it would be better to use the scene graph / vector format which is easier for users to reason about.

[3] https://en.wikipedia.org/wiki/Vector_graphics

[4] http://blog.leahhanson.us/post/recursecenter2016/haiku_icons...

But perhaps this is over-complicating things.


Thank you for those suggestions! Do you know if any of those tools following either approach has any automated tests? My immediate problem is how to manage the complexity of implementing a layout engine or editor. Somewhere I need something checking that a given asset identifier results in specific pixels. And I'd like the tests for _that_ to be nice to read. It's a bit of a chicken-and-egg problem..


Glad that you found this to be useful!

Android includes the Espresso UI testing framework [1]. Essentially, you can specify matchers that compare your expected values or predicates against an actual object identified by an R.id identifier. It's very powerful (since you can write your own custom matchers) but can be cumbersome to use [2].

[1] https://developer.android.com/training/testing/espresso/basi...

[2] Example Espresso Test: https://github.com/android/testing-samples/blob/main/ui/espr...

https://github.com/android/testing-samples

Alternatively, Squish [3] is a very polished and more elegant commercial testing tool that lets you record test-cases using a GUI tool and convert them into (ideally modularized) methods that verify object properties or compare (masked) screenshots of the GUI:

[3] https://www.froglogic.com/squish/features/

Demo video (starting at 14:24): https://youtu.be/ElH-3MVHPRw?t=864

They abstract away a lot of the functionality using the Gherkin [4] domain-specific language so that tests are easier to read at a high level (but you can still dig down into the underlying programmatic implementation).

[4] https://cucumber.io/docs/guides/overview/

This is probably too much complexity for your use-case, but may provide some ideas or inspiration for what is possible. Perhaps a simplified matcher-style system might be a good starting point though.


I'm working on generating code across the stack and languages from source of truth data models.

Low code for devs. https://github.com/hofstadter-io/hof

Trying to reduce redundant tasks and simplify changes with minimal effort.


Who are "we"?


I suppose it is just him and the driver.


Probably the HN participants ("Quickfire problems, quickfire suggestions")


I'm working on destroying proof-of-work blockchains. I have a plan for BTC, but I'm not sure how to approach ETH. Advice would be appreciated.


If you're interested in eliminating proof-of-work for ETH, you should really take a look at the proof-of-stake network in progress. Keywords: "Eth2", "proof of stake" and "the merge".

When proof-of-stake takes over, there won't be any miners. The block proposal process is done by stakers instead. Some of the incentive issues with miners still exist with stakers, but raw competitive power consumption isn't one of them.

It's true that proof-of-stake has been talked about for years, but it has picked up momentum since late last year, as the staking network was actually launched.

The proof-of-stake network has been staking real ETH since end of last year, but does not yet handle mainnet ETH contract transactions. It's called Eth2, but that's caused some confusion, because it's not really a second version to run alongside the first, it is the R&D branch into proof-of-stake and other technical improvements, with mainnet ETH expected to adopt it in due course.

So, the Eth1 components have been renamed "execution layer", Eth2 components renamed "consensus layer", and through a series of testnets and API developments which have been quite active this year, a big change called "The Merge" is being worked on by multiple funded groups (for client diversity) of core Eth developers at the moment.

The Eth2 staking network that already exists has demonstrated the viability, and the investment of real ETH in serious quantities has built up some cryptoeconomic stability prior to its deployment as the ETH consensus layer. The time lag is intentional - you don't want to suddenly switch all ETH over to a network with too few invested stakers.


The proposed PoS scheme for ETH will run afoul of regulators as soon as a big-enough crime is financed on ETH. I do not need to put effort into destroying those things; they are ticking time bombs. PoS separates participants into pigs and chickens, and the pigs could find themselves liable as money handlers.


> I do not need to put effort into destroying those things [proof-of-stake blockchains]

You originally said you want to destroy proof-of-work blockchains, and were looking for advice on how to do that with ETH. It's already being destroyed on ETH by proof-of-stake. You asked for advice, that's the advice. You don't need to do anything except wait.

Now you are saying you don't need to put effort into destroying proof-of-stake. Why is that relevant here? It suggests to me your goal is different from what you originally stated. Are you looking to see the destruction of more than just proof-of-work? The destruction of BTC and ETH, even if they switch away to another consensus mechanism?


Interesting, what approach are you taking with BTC and what's the reason that approach won't work for ETH?


I imagine overwhelming miners with legitimate but expensive work. The way to generate this work is relatively sensitive to low-level blockchain parameters. BTC happens to fit the approach, but ETH doesn't. This isn't surprising because the approach was originally invented for BTC alone, prior to the proliferation of many different blockchains.


for eth - just wait?


I'm trying to be the Amazon of real estate. If anyone is interested DM me


That sounds interesting, sadly I couldn't find a way to message you on here, do you have any other way of contact?


Please provide a way to contact you, ideally email!


LOL

Plenty of problems - none of them technical - all people problems!


I think these can be discussed as well. In the end thats the ones we suffer the most from. (edit: typo)


Yep, lots of problems are people problems.


I am working on two problems:

1. I want to create way to generate electrical power without pollution. Basically, a closed cycle process that releases no pollutants, or electronic waste.

2. I want to do everything I can to eliminate gender bias in the world.


Regarding 1, if you can accept some pollution at the beginning, hydro-electrical can be a solution, albeit probably not for a global scale.

We have a small hydro-electrical plant una River near my house and really it's no big deal, it fits very nicely in the surrounding environment and it produces clean energy.

It's also educational because since the river is near the city small children classes can visit it and learn about it.


Regarding hydro, you can go micro to power a house or some small comunity. There are lot's of books on microhydropower, but get a look at this fantastic post at ludens.cl

http://ludens.cl/paradise/turbine/turbine.html


Rather than generating electricity directly, it might be more practical to reduce electricity consumption using other approaches:

Geothermal can be a solution for generating electricity directly, but if you'd like to minimize electronic waste perhaps it would be easier to use it to replace alternative energy sources for HVAC purposes.

Biofuels (eg: plant bamboo, grow it, then burn it) can also technically be closed cycle energy sources.

Solar water heaters can also reduce electrical or fossil-fuel-based energy consumed for generating hot water.


I love hydroelectric power and I want the circuits I design for solar to be capable of utilizing the raw power from a small turbine as well without any hardware modifications.


You sound like a wonderful person!

What progress have you made in your work on either front? What sort of work do you do to solve these problems?


I worked for years studying how to use microcontrollers and after a lot of determination now I have a $750,000 grant to build solar systems that are fireproof so you can install them anywhere. It will be a few years of work to get something suitable made but I have full confidence it can be done.

I am also spending much time lately in the SF kink community to build a fundamental understanding of the biases people have experienced in life with respect to their gender identity, and am strongly considering HRT so I can live life on the other side and experience the prejudice first hand.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: