1) Cleaning the data as it comes in rather than in batches so we can use it sooner: discarding invalid data, detecting outliers, normalizing inputs, etc.
2) Warehousing of the data with proper indexes so you can perform some advanced queries on unstructured data
3) Some data is sent in bulk at the end of day, some of the data is streamed in fire hose style. How can we preprocess the fire hose data so that we don't have to wait until the end of the day to parse it all?
4) Oh and all of this data is unstructured and comes from 75 different sources.
Soon the average hedge fund will have more people just cleaning and managing data than they do in quantitative research, dev ops, software development and trading.
Oh and lots of the data is considered proprietary so while AWS/Azure, etc is fine, sending it to a third party to process is not.
Help me, I'm drowning in data. How do I get the time from when I acquire data to when I trade based on it down to a reasonable time frame, where reasonable is closer to hours than days/weeks?
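For point 1, cleaning event by event rather than in batches could look roughly like the sketch below. It is only an illustration under stated assumptions: records are single numeric readings, outliers are flagged with a running z-score (Welford's online algorithm), and the names `clean_stream`, `z_threshold` and `warmup` are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online algorithm: mean and variance without storing history."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def stddev(self) -> float:
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0


def clean_stream(records, z_threshold=4.0, warmup=5):
    """Yield (value, is_outlier) for each valid record, one at a time.

    Records that cannot be parsed as a finite float are discarded.
    The threshold and warmup values are illustrative assumptions.
    """
    stats = RunningStats()
    for raw in records:
        try:
            value = float(str(raw).strip())  # normalize: strip whitespace, parse
        except ValueError:
            continue  # invalid record: discard
        if not math.isfinite(value):
            continue  # NaN or infinity: discard
        sd = stats.stddev()
        is_outlier = (
            stats.count >= warmup
            and sd > 0
            and abs(value - stats.mean) / sd > z_threshold
        )
        if not is_outlier:
            stats.update(value)  # keep outliers from skewing the baseline
        yield value, is_outlier


# Hypothetical feed mixing good readings, junk, and one spike.
feed = ["10.1", " 9.9", "bad", "10.0", "10.2", "9.8", "10.1", "250.0", "nan", "10.0"]
results = list(clean_stream(feed))
```

Because the generator keeps only three numbers of state per series, each reading is usable the moment it arrives instead of after an end-of-day pass.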
We threw every rule out the window in the name of performance _when fetching raw data from external sources_. So we had weather station networks, NOAA forecast runs and NASA satellite data in a workable schema in our shop way faster than average. Mix of C, PowerShell, Perl, and the nonstandard parts of T-SQL, highly parallelized, tricky but fast.
After the "workable schema" was established, the rules came back and we acted more responsibly. Smart instead of clever.
Ran this stuff all day long, getting every piece of data asap. Things that can only be calculated with a full day of data we poked and prodded the meteorologists to express in "partial aggregates", which to me were just like the map steps before an EOD reduce.
Took a lot of mutual understanding and iterating but worth it in the end. When the ultimate data source (satellite or radar site for us) posted its last hour of data, we were 95% done with the day's computation work. We do our last step, publish our numbers, and bam, our ag clients have this stuff a day earlier than they are used to.
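The "partial aggregates" idea above can be sketched as follows. This is not the poster's actual system, just a minimal illustration assuming the daily numbers are a mean, a low and a high, with hypothetical hourly temperature batches; the class name `Partial` is invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Mergeable summary of one batch of readings (the 'map' output)."""
    count: int
    total: float
    low: float
    high: float

    @staticmethod
    def from_batch(values):
        values = list(values)  # assumes a non-empty batch
        return Partial(len(values), sum(values), min(values), max(values))

    def merge(self, other):
        # Merging is associative, so batches can be combined as they arrive.
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.low, other.low),
            max(self.high, other.high),
        )

    def mean(self):
        return self.total / self.count


# Hypothetical hourly temperature batches arriving through the day.
hourly = [[11.0, 12.0], [14.0, 15.0, 16.0], [13.0]]

running = None
for batch in hourly:
    part = Partial.from_batch(batch)  # map: done as soon as the hour lands
    running = part if running is None else running.merge(part)

# The end-of-day reduce is just reading off the merged summary.
daily_mean, daily_low, daily_high = running.mean(), running.low, running.high
```

The catch is that this only works for statistics that decompose this way; something like an exact median needs a different summary, which is presumably where the iterating with the meteorologists came in.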
It's a native streaming platform so your data will be cleansed, processed, scanned for outliers event-by-event rather than in batches. We have dozens of streaming connectors for IT/Enterprise/Web data sources. We also support initial load for your firehose data. For unstructured data, we have support for RegEx based parsers.
Shoot me a message if you have any more questions. We have many big name users in Aerospace, Banking, Device manufacturing, and Logistics industries.
guarantee worst-case linear time matching for any number of expressions
It's open source, and match time is linear in the length of the input string.
(Disclosure: I work at Google on a different open source project)
More like "mitigating" or "occasionally getting regex slightly more right than some other solutions". There are many different approaches to regex and all seem to focus on different parts of functionality (RE2 focuses on quick compiles and simplicity, libpcre has 'all the functionality', we're about streaming + large scale + high performance if you can tolerate long compiles and lots of complexity). A number of new projects are trying very interesting approaches, like icgrep and the Rust regex guys.
I would be curious to hear about your approach.
I have seen local companies work for months/years before they could finally use their BI package, because the trouble at step 1 is big (as is putting the data in a "nice" schema).
The problem is that entering this space is hard. Years ago I was at a company that had a niche product (in FoxPro) for this kind of task, and I have dreamed about building something like it based on my experience, but getting funding for this kind of "boring" task is hard (more so in my country, Colombia).
P.S.: If you want help, we can talk. I can't offer a magical solution, but at least I find these kinds of "boring" jobs compelling ;)
And on the part about firehose data, you might already know this, but Kafka and their line of work should be aligned with what you're after.
Check out my company, SnapLogic: https://www.snaplogic.com/
I don't know much about Paxata but I think Trifacta are well-regarded in industry and academia. Trifacta founders worked on research / open-source-? project Data Wrangler http://vis.stanford.edu/wrangler/ and turned it into Trifacta.
I don't know much about either product in truth.
Open Refine seems to be the best product in this category. I haven't used anything but my own tools to do this before, so I can't really offer any advice.
My understanding is that this is just a fancy way to talk about a database with a schema designed for analytics. There are many open source databases which do this very well, the one I use being Cassandra (and/or KairosDB), though it is also likely the one that is hardest to use. For a beginner, you might want to refer to this SO answer: http://stackoverflow.com/questions/8816429/is-there-a-powerf...
This is something that is incredibly dependent on the data sources, so I likely can't tell you anything that will help. Most of my data sources I've worked with have been internally sourced log files, messages from ZMQ, or CSV data - you might be working with something far different though, since there are lots of public data sets and such which are common. Ideally, this would be integrated into the tools that you are using to clean the data, but I don't know if that exists.
Handling input from many different sources at different rates is not a very hard problem to solve if your system is built correctly - you could for example run a daemon for each data source which will populate the database when there is new data available, then send a message off to the processing engine, which will integrate the data into whatever reports you are running.
Specifically for a use case of a hedge fund, the reports could be triggered by a message which is sent when the new data is available, and processing could be done in parallel in Lambda or similar dependent on need to get a nearly instant return, enabling nearly real-time reporting.
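A toy version of the daemon-per-source layout described above, using threads and an in-process queue as stand-ins. Everything here is an assumption for illustration: the list `store` plays the database, the queue plays the message bus, and the source names and `fetch` callables are hypothetical.

```python
import queue
import threading


def source_worker(name, fetch, store, lock, notifications):
    """Poll one source, persist each new record, then notify the processor."""
    for record in fetch():
        with lock:
            store.append((name, record))  # stand-in for a database insert
        notifications.put(name)           # "new data available" message
    notifications.put(None)               # sentinel: this source is finished


def run_pipeline(sources):
    store, lock, notifications = [], threading.Lock(), queue.Queue()
    workers = [
        threading.Thread(
            target=source_worker,
            args=(name, fetch, store, lock, notifications),
            daemon=True,
        )
        for name, fetch in sources.items()
    ]
    for worker in workers:
        worker.start()

    refreshes = {name: 0 for name in sources}
    finished = 0
    while finished < len(workers):
        name = notifications.get()
        if name is None:
            finished += 1
        else:
            refreshes[name] += 1  # stand-in for re-running the affected report
    for worker in workers:
        worker.join()
    return store, refreshes


# Hypothetical sources: one end-of-day bulk drop, one streaming feed.
sources = {
    "eod_bulk": lambda: iter([{"rows": 3}]),
    "firehose": lambda: iter([{"tick": 1}, {"tick": 2}, {"tick": 3}]),
}
store, refreshes = run_pipeline(sources)
```

In a real deployment the workers would be separate processes and the queue something durable, but the shape is the same: sources never wait on each other, and reports refresh when a message says there is something new.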
1. The UX is subpar. It insists on running in only a single tab at a time, and attempts to open multiple tabs will instead override whatever it considers to be the master tab. This is a huge pain, because I often need to have a mapping workflow open in one window and some other relevant part of the application open in another. Instead I have to save, go find the thing I want, and go back. Another problem is that when working with data sources containing tons of fields, there's no easy way to search.
2. It offers an expression language to perform some computational tasks, similar to what you'd find in Excel, but it's hamstrung by a poor UI and a limited amount of functions. The built-in editor for expressions is really poor (see Tableau for an example of a great editor for a simple Excel-like language; it even has type linting) and, unless I've misunderstood something, you can't declare any variables so you end up with huge nested expressions. There aren't many functions available, so something as simple as removing whitespace ends up as lstrip(rstrip(foo)). In combination with no support for statements (or at least a let expression like in lisp) this makes any nontrivial data munging completely indecipherable.
I've looked around in this space and it seems like there are a variety of products, but the supplier of our main CRM will only support Informatica Cloud. I think that a company that can offer a product that does what you've said but makes a serious effort at UX could cause users to revolt and demand to use it! I know the joke is that Slack is just a pretty IRC with better UX... but that's exactly why it has become so successful.
In terms of data munging, take a look at Microsoft's Power BI. It's visualization software but it has a nice data munging mode that, crucially, keeps track of all the changes you make and displays them in a linear format. This is great for getting a quick idea as to what was done with the data and is essential for doing reproducible data analyses. Unfortunately, Power BI also suffers from poor UX in insisting on tiny fonts and gray-on-gray palettes that are totally unreadable to anyone over 30.
Omg, so many sales pitches. You should figure out which of those were automatically generated by someone whose bot is crawling HN and using NLP to find posts like this, and then hire them. There's basically 0 chance that isn't happening...
Sounds like a job for Apache Camel?
The same line of inquiry has been evaluated for most 3rd party software that companies rely on. For this specific instance of data collection and cleaning, I'm imagining it's not going to be a much different calculus, although perhaps you'll see a higher percentage of firms choosing to roll their own if they have the chops and pockets (e.g. Two Sigma, Bridgewater, Goldman Sachs, etc.).
I will note that there are commercial mechanisms firms could try to implement to limit the downsides in case something like this happens: warranty & damages provisions and insurance are two that spring to mind. I'm sure there are numerous other considerations in the age-old "build or buy" cost-benefit analysis.
Problem solved on all points.
100% fully functional, fully featured, 100% free download right from the website.
Disclaimer: I work for splunk.
Everyone I know in law is dissatisfied with every part of their tech stack. If someone could come up with an integrated SaaS solution, and be SUPER careful about compliance... they would be printing money.
It's a Sisyphean task. They are, as a rule, extremely anti-technology and conservative. At a previous startup, we had built software which was saving customers many hours a week—yet it was still an uphill battle to get paying approval.
If even after all the warnings in this thread you really want to build legal software, focus on disrupting lawyers instead of selling to them.
In general, law and media are two of the worst fields for technology.
See how well pitching 'do it faster and make less' goes over.
Oh, the clients expect an itemized bill? Simple, the above charges would be "10 legal intern equivalent hours @ $150/hour". If a client questions it, the lawyer can explain that they are now using a very expensive piece of software instead of interns and attorneys for certain tasks, but felt it was an ethical obligation to quote the cost in a human understandable way. Turn the arbitrary pricing into a positive!
And of course your software should be able to quote all its tasks in these legal intern equivalent hours. This also leaves the lawyers' hands clean since they can say that the software came up with the hourly figure, not them.
Perhaps bringing it back to a development perspective might shine some more light on it for us. Imagine you're a freelance developer and you've now developed (or bought) a fancy piece of software that allows you to do plenty of code-generation and reduce the amount of menial database layer code that needs to be written. You're now say 1.5x more efficient at delivering a product. What are you to do then? I doubt many clients would agree to a once-off fee for usage of your fancy code generation tool, even if you phrase it as saving "4 intern developer hours", and charge appropriately. There is also probably a cap on the hourly rate they're willing to pay you. Either that, or you change to a per-deliverable or product pricing model.
Sometimes change does need to be slow.
It's part of why I encourage everyone I know (particularly developers) to switch from hourly to fixed-price billing. Any efficiencies you gain should belong to you, not the customer. (There's also the fact that I find a lot more people are willing to pay $10k for X than $250/hr for 40 hours.)
You see how that goes. Project pricing leads to a guessing game. Billing hourly is fair for everyone, at least in software. If I am more efficient, I pass that onto the customer. I don't 'lose' money -- it usually results in more work.
Imagine charging $80 for some corn because I want to make the same money as if I had guys hand-picking and hand seeding and doing the entire farming process without machines. That corn only cost me $0.10 to produce but I am charging a price as if I didn't have modern efficiencies. I would sell a lot less corn and actually profit less due to both competition and price elasticity. People would look for alternatives to corn.
In software, not passing on efficiencies means that there would actually be a smaller market for software development. Imagine how bad the market would be for us if we wrote everything in assembly. A simple web site might cost $100m and there'd be exactly 5 people in the world building websites.
I did some fixed price work this summer for a project where I thought the scope was unusually well understood by both sides. About 3 months, 60k USD if done by a fixed deadline (yes - fixed scope, fixed price, fixed deadline!) and as far as I was concerned from the original spec I had it done within about 6 weeks.
Of course, I spent the rest of the project time politely asking the customer to sign it off and doing the odd freebie to try and keep them happy but mostly at home, not working and not wanting to take anything else on in case they turned round and said I'd screwed up somewhere massively.
Perhaps unrelated, but I still haven't been paid for all of it either. Still, if I do eventually get paid it all it will have worked out better than charging per hour.
If a customer demands additional features, you prepare a Change Order and say, "OK, here is how long it will take and how much extra it will cost."
After a while they learn discipline and stop asking for changes half-way (or more) into the project.
Here is another perspective: the vast majority of features I've built as part of Change Orders rarely, if ever, got used. Granted, I make sure all relevant stakeholders are involved in the creation of the initial Scope of Work. That way, there are no late-comers who demand changes/additions.
The problem with hourly billing is it very poorly aligns incentives. It actually discourages efficiency because the easiest way for me to make more money is to take longer.
Also, psychologically, most clients are not comfortable with the vast differences in appropriate pay between developers. Even in the worst case (where scope was poorly defined and/or I estimated poorly), I'm making more now than I ever did with hourly billing.
If you had a monopoly on modern farming, it would absolutely make sense to charge $40 for corn. You'd soak up all the demand (since you're undercutting the $80 hand-harvesters) while still having massive profit margins.
Getting good at scoping is difficult but by no means impossible.
Companies in my coworking space are switching one after another. One has gone from a ~6500€ yearly bill to ~3500€ (3 employees), while improving reportability.
Non-industrialized accountants are just as necessary as human cashiers: Not. Lawyers are a bit harder to industrialize.
I think contracts (between law firms and their clients) use hours worked because they don't know upfront how complicated cases will be, how long it will take etc. It's not just for "understandable pricing". Your "bill whatever they want" suggestion is basically saying that at the end, the law firm can quote whatever price they want, and the client agrees up front to pay that.
Could you expand on the "media" part?
But here's the real problem for anyone looking to innovate in that space: the customers. Lawyers are as a rule anti-technology, slow to adopt new techniques, and set in their ways. Worse yet, they just bill their clients for their shitty software like Lexis or WestLaw, so they aren't even personally motivated to reduce costs.
Doesn't this take money away from your firm? It is only when firms are competing on cost, time or client recognized quality that they will institute better workflows via software.
From our perspective in the national standards group, we would actually want our associates to just spend more time on value added activities. Instead of wasting time organizing PDFs of exhibits and monkeying around in spreadsheets, we want them evaluating the relevant legal and technical tax issues. So it's not precisely cost control that is the primary concern, but quality assurance.
No, because we are in fact competing on cost and client-recognized quality, and to a lesser extent time. Plus our fees are driven more by the market than by our actual costs, so if we billed fewer hours, we would simply bill at a higher rate to reach the same expected fee while still maintaining our position in the marketplace. Or if we could reduce our fee, we might be able to win more market share.
The pejorative term in the industry for padding billing with useless busy work is "fee justification," which really shouldn't ever be necessary. Especially in my practice area, because there's always more work that can be done to flesh out our deliverables, which in turn makes them more effective for convincing the IRS (or state equivalent) or an appeals judge. When I say I've cut thousands of dollars of charge hours, we didn't simply stop charging those hours, we allocated them to more useful, value added activities.
Right now, staff spend far too much time inefficiently manipulating data in Excel, manually organizing exhibits, and a variety of other mundane, low cognitive effort tasks (I can't really specify what kinds because that would essentially doxx me). They feel productive, they look productive, and they meet their charge hour goals. And it allows them to procrastinate on the more mentally taxing work, like evaluating the relevant legal and technical tax issues, which in turn detracts from the quality of our service. Our clients aren't paying us to be extra-expensive outsourced spreadsheet monkeys. They're paying us to eliminate uncertainty about complicated legal and tax issues. So freeing up engagement budget and the staff's mental bandwidth to focus on the high value added cognitive services is tremendously useful in improving quality.
And in terms of time, we compete on that in some cases where there's an audit, exam, or appeal deadline and the client came to us late in the game. But that's an edge case and relatively rare. Certainly having a reputation for being quick, efficient, and timely wouldn't hurt our market position, though.
The firm charges their clients on an hourly basis, so they don't really have an incentive to be more efficient.
Logojoy, for instance, is an example of a service that supplants human labor with a single "good-enough" deliverable at a low price, and does so in a fraction of the amount of time. I imagine this would be much more difficult in legal settings, but LegalZoom seems to be alive and kicking, so it must be possible.
To your second paragraph, I would add that it's hard for customers (and lawyers) to figure out what is "good-enough" in the legal setting. I'm a lawyer and there's a lot of stuff you can find on the internet that I personally think is good enough (I would use it in my personal affairs because the risk of the missing edge cases being an actual problem is slim) but I wouldn't be comfortable recommending it as a solution to a client because those missing edge cases are a real malpractice risk.
In the case of a logo, good enough is whatever the client thinks is good enough. In the case of a lot of legal solutions, good enough is often a murky risk/reward calculation based on legal concepts the client may not understand completely.
I still think there's enormous room for improvement, both in helping clients understand the concepts and the risks they're taking, and also in providing better automated solutions.
I'm sure that there are lots of legal consulting companies that do this for people and entities that consume lots of legal services but the real trick is providing it profitably to "unsophisticated" people doing a one time thing.
That last step's a real doozy, though. Startups are a field that thinks "move fast and break stuff" is actually a good idea. That kind of thinking works when you're slinging viral social media and personal productivity services, but it is catastrophic when you try to move into an industry where your customers' lives or livelihoods are on the line.
>The firm charges their clients on an hourly basis, so they don't really have an incentive to be more efficient.
While I agree that the billable hours system reduces the incentive to be more efficient, I don't think it removes it entirely. Otherwise lawyers would still be using typewriters to draft memos. In my experience, removing some of the inefficiencies frees up time and mental bandwidth to focus on activities which actually benefit the client. More time reading cases, researching, evaluating issues. And you can bill for that.
I don't get it. I'm pretty sure they're not hurting for new cases, so they'd make up any losses in fewer hours with more clients.
Note: perhaps my experiences aren't representative of the industry as a whole.
The only "legal tech" that can succeed (in my opinion) is the kind that eliminates the need for lawyers, but then you're up against a different problem: people who think lawyers are magical wizards who can invoke spells to keep lawsuits and regulators at bay. It's really hard to convince many people that they don't need a lawyer, even though lawyers and law firms are almost never accountable for the advice they give.
Your advice is akin to saying: "Hey, inexperienced coder, write some production-ready code but don't test it, and the one time it needs to run, give it a try. Hope you don't screw it up!" All while there's another coder in the room who can claim "oh no, he meant to set my financial variable to 100x, not 10x" and can convince the compiler to agree with them.
But... saving time means they have more time to provide more services, accept new clients, and review their documents to decrease mistakes. Is the relationship not apparent in their minds?
Can you explain what you mean? Letter generation etc. is still useful; I don't see what billing has to do with it. They can still charge what they want to.
Quite amazing that it’s still around.
My impression is that the legal industry is most of the reason why WP is still being used.
Workers Comp, Social Security, Family Law, and elder law in general aren't as glamorous. Clients for those services don't have such deep pockets as the other corners of the law do.
It's likely that a good SaaS-based system with embedded knowledge of jurisdictional rules (in the US, federal and state rules) could be successful.
But the sales cycle for a new product? Getting early adopters? Prepare for some pain.
Not familiar with the space, but seems like what you are describing.
I would suggest we make a monthly of these as they provide important insight into industries.
Just imagine how much valuable knowledge and insights get lost every time someone retires.
I got so excited about the thread and it made me realise something very interesting which I turned into an essay. I call it looking for hidden problems underneath obvious solutions.
1) A better system for automation and measurement. Current solutions aren't ideal when it comes to setting up new systems as well as updating and maintaining existing systems. We build several million dollar facilities a month and each one has automation and measurement equipment that has to be individually set up and programmed. Each technician does things a slightly different way, and the end result is a different set of automation and measurement logic at each facility.
2) Fiber optic DATS (distributed acoustic and temperature sensing) data handling and interpretation. This is a fairly new type of technology in which a fiber optic line is installed in the wellbore. The fiber optic line basically acts as a 15,000' strand of thermometers and microphones placed every 3'. The data from one installation is on the order of terabytes per hour. Oil and gas service companies that offer this service don't know how to handle this amount of data. The problem could probably be solved with S3 or something.
3) Drilling optimization. Create a software suite that utilizes ML/AI to help drilling engineers figure out what the best way to drill a well is. It's a perfect ML/AI application. Lots and lots of training data available, easily defined input and output parameters, etc. Drilling engineering is full of hard, non-linear problems and humans are just really bad at it. The only way to be good at it is to drill lots and lots of wells and then listen to your gut.
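To put a rough number on the fiber optic data volume in item 2, here is a back-of-the-envelope calculation. The fiber length and 3' spacing come from the comment; the sampling rate and bytes per sample are assumptions picked only to show how the arithmetic works, not vendor figures.

```python
FIBER_LENGTH_FT = 15_000   # from the comment above
SPACING_FT = 3             # from the comment above
SAMPLE_RATE_HZ = 10_000    # assumption: acoustic sampling rate per channel
BYTES_PER_SAMPLE = 4       # assumption: one 32-bit value per sample

channels = FIBER_LENGTH_FT // SPACING_FT                 # sensing points along the fiber
bytes_per_second = channels * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
terabytes_per_hour = bytes_per_second * 3600 / 1e12      # decimal terabytes

print(f"{channels} channels, {bytes_per_second / 1e6:.0f} MB/s, "
      f"{terabytes_per_hour:.2f} TB/hour")
```

Under these assumptions that is 5,000 channels and a bit under a terabyte per hour, and the figure scales linearly with sampling rate, so "on the order of terabytes per hour" is plausible. Storing it is the easy part; the sustained write rate and making sense of it afterwards are where it gets interesting.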
If you are interested in helping us understand the problem and potentially solve it together contact me at firstname.lastname@example.org
I work with technology that might be useful for this type of application. Thanks!
So to answer your question, any operator requires these services, though most don't know it. A company called Pason is the leading company in the drilling data industry. Their bread and butter is just data measurement and streaming, though they recently have entered the analytics space. Their technology seems pretty promising.
I feel like it'd be really hard to break into this field without some data to play with. Catch-22...
If you have access/desire to share some data like this I'd love to chat more (email in bio). Sounds like an interesting problem.
Every time I change jobs as an H1-B employee, I have to fill in the same ridiculous data in every law firm's weird interface. I wish the US Digital Services would focus on streamlining forms and having auto-import from all the data they already have about me (e.g. automatically translate I-94 records to how much time I actually spent in the US, infer my past I-797 records automatically, have a one time education related upload since that obviously never changes). I realize there are certain valid reasons the agencies don't share data, but I find it hard to believe that, in an era of infinite surveillance, they can't use the surveilled data to at least make my life easier. I can see how the immigration law industry would never allow this, but I can hope.
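The "translate I-94 records to time spent in the US" part is mostly date arithmetic once the records are in hand. A minimal sketch, assuming the travel history is already a list of (arrival, departure) date pairs; that input shape is hypothetical and not the actual I-94 export format, and counting both the arrival and departure day as days present is also an assumption.

```python
from datetime import date


def days_in_country(travel_history, as_of):
    """Sum days present from (arrival, departure) pairs.

    A departure of None means the person is still in the country as of `as_of`.
    Arrival and departure days both count as days present (an assumption).
    """
    total = 0
    for arrival, departure in travel_history:
        end = departure or as_of
        if end < arrival:
            raise ValueError(f"departure {end} precedes arrival {arrival}")
        total += (end - arrival).days + 1
    return total


# Hypothetical travel history.
history = [
    (date(2016, 1, 10), date(2016, 6, 30)),
    (date(2016, 7, 15), date(2016, 12, 20)),
    (date(2017, 1, 5), None),
]
total_days = days_in_country(history, as_of=date(2017, 1, 31))
```

The hard part is not this function but getting clean, complete records out of the agencies in the first place.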
The green card process is another minefield.
Also for Schengen countries, I have to apply for a visa every time I travel, and they make me list every time I visited the Schengen zone in the past 5 years, fill out the same application form across different countries, and get the same paystubs and letters from employers. Even a tool that could just machine-read all the documentation a particular country requires for a specific visa, and then pull everything that can be pulled (bank statements, pay stubs, travel dates from the flight ticket emails in my inbox, hotel reservations and so on), would help. Just make it convenient for me to travel :)
I'd say closer to four to be safe.
Source: am cofounder of SimpleLegal
Unfortunately, there are very few people that understand both computer science and structural engineering.
The software for that kind of modeling is apparently pretty basic, pretty expensive, buggy, etc.
I thought it was shut down largely because much better software simulations were made available; are they still kinda crummy?
Fluid flow is a very difficult problem to simulate; structures are a lot simpler.
That doesn't mean you can't just take a couple of web developers and make it usable, but as the market is small, it might pass the price point with a supersonic bang...
They have a sheet with around 30 rows and 150 columns, and they have 100 of these sheets (in a single Excel file). Some parts use formulas, but usually when somebody needs to change something they need to go through every single sheet. The issue is now when they try to add new data Excel won't let them.
I don't even want to know how they share the file or do backups.
Good thing you're in healthcare...
And that pushes the data back into the same spreadsheet and a relational database with change history.
Is there a queryable relational database?
Where I would invest (if I were Autodesk or their competitor) is in releasing CAD tools for free in exchange for a consent to use the designs/details internally for ML purposes. Would love to contribute if anyone is working on such a product.
The plan is to link CAD to the mathematics and then link the finite element to this as well. The system would also function as a sort of github for engineering where users can find and use functions to do most standard analysis. Email is in my profile if anyone is interested in talking.
I really like the equations and how you only allow to make formally correct equations (including units). Anxious to see how this develops.
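For anyone wondering what "only formally correct equations (including units)" means in practice, here is a minimal sketch of dimension checking. It is not how the site above implements it; the `Quantity` class and the (mass, length, time) exponent tuple are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # exponents of (mass, length, time), e.g. force = (1, 1, -2)

    def __add__(self, other):
        # Addition only makes sense between like dimensions.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dimensions {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        return Quantity(
            self.value * other.value,
            tuple(a + b for a, b in zip(self.dims, other.dims)),
        )

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        return Quantity(
            self.value / other.value,
            tuple(a - b for a, b in zip(self.dims, other.dims)),
        )


force = Quantity(12_000.0, (1, 1, -2))  # newtons
area = Quantity(0.004, (0, 2, 0))       # square metres
stress = force / area                   # pascals: dims (1, -1, -2)

try:
    force + area                        # dimensionally meaningless
    rejected = False
except TypeError:
    rejected = True
```

A real tool also has to handle unit conversion (psi vs. Pa, kip vs. N), which is where most of the engineering spreadsheet errors I have seen actually come from.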
(full disclosure: I am co-founder of a software product which tries to achieve the same aims using different concepts: www.valispace.com)
There are places for users to upload and store data as well - datasets.
This will allow me to link documents to CAD. When the math changes, the CAD will change as well. Once that is done, I will add a finite element meshing and solution system to create an engineering platform that essentially does everything.
I like your site. It's nice to see other people addressing these problems. I am also an aerospace guy. I worked on the shuttle for a while and then designed some components for the Orion. Shoot me an email if you want to talk more.
You need computational geometry, computer graphics, and structural engineering expert level domain knowledge to implement anything. You need to create traditional 2D machine/construction design drawings from the 3D models. Then you need to sell it to corporations, whose work, most of all, must be dependable and free of guess work.
You need to know what sort of geometries you can use to model the reinforcements. Then you need to know how to design the system so it can handle very large amounts of geometry.
The worst of all is you need to deal with god awful industry standard formats- DWG, DGN, IFC, Step/Iges and so on. Maybe DWG import and export first.
To have any real chance you need a guy or two who are good with numerical code, someone who is familiar with e.g. game engines, someone who knows computer graphics, a structural engineer to tell how he does his job and what the thousand inconsistencies in the field are (this is not a trivial domain like housing or transport), a sales/marketing guy to connect and push the product.
And, like someone else estimated, the potential market is not gigantic - which is kinda funny because we all depend on reinforced concrete but don't need so many engineers for the design work...
These are much more standalone, and don't have many of the issues you listed.
No one I know is using the automated concrete design built into analysis programs like ETABS, Tekla, etc.
The utility of your software tools will be very limited if you are restricting yourself to only member design instead of total structure solutions like ETABS. Why should an engineer pay you at all if they can use a spreadsheet for free to do what you do with your SaaS?
> No one I know is using the automated concrete design built into analysis programs like ETABS, Tekla, etc.
Not too sure about this because I know quite a lot of people who are using these tools. Any reason why the people you know don't use ETABS or Tekla?
Why do businesses invest in new tech? Why pay for excel when I can use a pen and calculator? The answer is because it makes them more efficient. We have excel sheets to do the same thing, matlab code to do the same thing, and yet here we are paying for these member design tools because they are the most efficient for us. If you save an engineer even a couple of minutes for each element they are designing, you essentially pay for the software.
>Any reason why the people you know don't use ETABS or Tekla?
We do use ETABS extensively for analysis. We don't use it for design. It is foolhardy to trust the automated RC design in this software. That seems to be the standard of practice around here, but perhaps it is different in other areas of the world.
Do you mind if I ask why? I'm working on a sort of general approach toward designing trustworthy engineering software, and I'm trying to collect as many reasons as possible for "can't trust the software".
It's not a distrust so much as a fundamental flaw. For simple gravity design it works fine, but even then we are using spColumn because it's just quicker for us.
We have been working for 1.5 years with some engineers on software to solve this: www.valispace.com
I would be curious to hear from you whether what we are building with a focus on the space-industry also applies to structural engineering.
In the US there are about 281,400 civil engineers. I couldn't find more detailed information on structural engineers.
-Assume about 10% are practicing structural engineers who need to design concrete structures = 28140.
-Assume a company wants 1 license for every 2 engineers = 14070. (I base this off the fact that my company has 6 licenses for 12 engineers, but we may be higher than average)
-Assume we could get 10% market share = 1407 subscribers.
-Assume $1000/subscriber/year = $1,407,000 from the US market
Obviously this isn't a very rigorous analysis.
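For anyone who wants to poke at the assumptions, here is the same back-of-envelope estimate as a small Python sketch. Every default is one of the guesses listed above, not measured data, and the function and parameter names are made up for illustration.

```python
def estimate_us_revenue(
    civil_engineers=281_400,     # US civil engineer count quoted above
    structural_share=0.10,       # assumed share who design concrete structures
    engineers_per_license=2,     # assumed: 1 license for every 2 engineers
    market_share=0.10,           # assumed achievable market share
    price_per_year=1_000,        # assumed USD per subscriber per year
):
    """Chain the assumptions and return each intermediate figure."""
    structural = round(civil_engineers * structural_share)
    licenses = round(structural / engineers_per_license)
    subscribers = round(licenses * market_share)
    return {
        "structural_engineers": structural,
        "licenses": licenses,
        "subscribers": subscribers,
        "annual_revenue_usd": subscribers * price_per_year,
    }


if __name__ == "__main__":
    for name, value in estimate_us_revenue().items():
        print(f"{name}: {value:,}")
```

Changing any single assumption (say, 5% market share instead of 10%) scales the final number linearly, which is a quick way to see how soft the estimate is.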
Single seat licences are not the only revenue model. Once a product gains traction consulting, training and providing VIP helpdesk and bugfixing services factor in as well.
(For what it's worth, I'm doing something similar in the transport planning space. And yes, bridging the gap between that and modern CS is a substantial piece of work.)
My management style is like this: every task/request is numbered, placed in a queue and assigned to a professional.
What I expect from my ticketing system:
- every manager should be able to assign tasks to someone and set the order they must be executed. He needs to know what his team is doing and when they finish each task.
- every professional should know what to do and what are the priorities.
- everything is numbered and linked, all communication recorded.
Everything should be well integrated with email (please, don't send me a notification email about an answer and an url, send me the f* answer). If I answer the email, everything goes into the system, I should be able to send commands to the system by email (for example, add a keyword in order to make it a comment instead of answering).
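To make the "commands by email" idea concrete, here is a rough Python sketch of how an incoming reply might be classified. The `#comment` keyword and the function name are hypothetical; a real system would also have to strip quoted text and signatures, which this ignores.

```python
def classify_reply(body, keyword="#comment"):
    """Decide what an emailed reply should become in the ticket.

    If the first line of the reply is the keyword (hypothetical "#comment"),
    the rest is stored as an internal comment; otherwise the whole reply
    is treated as an answer on the ticket.
    """
    lines = body.strip().splitlines()
    if lines and lines[0].strip().lower() == keyword:
        return "comment", "\n".join(lines[1:]).strip()
    return "answer", body.strip()


if __name__ == "__main__":
    print(classify_reply("#comment\nNeeds more logs before we reply."))
    print(classify_reply("Fixed in the next build."))
```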
Personally, I think the optimal ticket system would have this data for each ticket:
* A unique, prefixed ticket # (JIRA gets this right)
* An assignee (like an email To:)
* A reporter (like an email From:)
* A one-line summary (like an email Subject:)
* A multi-line body (like an email body, but ideally with markdown)
* Attachments (like email attachments)
* History for edits of all of these (not like email!)
That's it! It really is basically email, but with a unique ID, and editable with history instead of immutable with replies, and a decent UI, perhaps RSS + notifications.
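As a minimal sketch of that data model in Python (the class and field names are my own, not taken from any existing tracker): a ticket holds the listed fields, and every edit is appended to a history instead of overwriting silently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Edit:
    """One recorded change to a ticket field."""
    field_name: str
    old_value: object
    new_value: object
    edited_by: str
    edited_at: datetime


@dataclass
class Ticket:
    ticket_id: str                  # unique, prefixed, e.g. "PROJ-42"
    assignee: str                   # like an email To:
    reporter: str                   # like an email From:
    summary: str                    # like an email Subject:
    body: str = ""                  # multi-line, ideally markdown
    attachments: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def edit(self, field_name, new_value, edited_by):
        """Change a field and record the old value in the history."""
        if field_name in ("ticket_id", "history"):
            raise ValueError(f"{field_name} cannot be edited")
        old_value = getattr(self, field_name)
        if old_value == new_value:
            return  # nothing changed, nothing to record
        self.history.append(
            Edit(field_name, old_value, new_value, edited_by,
                 datetime.now(timezone.utc))
        )
        setattr(self, field_name, new_value)


if __name__ == "__main__":
    t = Ticket("PROJ-42", "alice@example.com", "bob@example.com", "Crash on save")
    t.edit("summary", "Crash on save (Windows only)", "bob@example.com")
    print(t.summary, len(t.history))
```

Note what is deliberately missing: no status, no workflow, no approvals. Anything beyond the fields in the list would be an extension.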
Unfortunately, everybody else seems to think that their ticketing system should embody their vaguely defined and ever-changing workflow, prioritization, approval, and release management system, so they want to be able to add any number of possible statuses, approvals, workflows and all the rest. Once you add that, you end up with another JIRA or ClearQuest or BugZilla, and the cycle repeats itself.
As is (consequently) the friction it creates in changing the workflow as needs change.
That's just the fundamental and immutable nature of the problem domain.
I'm not associated with them, but I have used them successfully for months at a time (better than most productivity software). The reason is it is well integrated and similar to email.
The recent GitHub updates let you assign multiple people to reviews and such, but I find it's usually better to tag everyone you want to look at something. I don't think assigning something will send a notification.
In a nutshell, I argue that the problem with most ticket systems is that they do not constrain the domain enough, so they wind up having similar problems to email (sifting through a chronologically-ordered pile of text rather than structured, semantically-ordered information).
Your comments make me think the crux of the problem is that people want tickets to be like email and use email to manage them. I'm not sure you can ever overcome the "chronological pile-up" problem if you allow email as a user interface to ticketing.
In fact, my usual approach to dealing with tickets/issues/emails which start to develop this problem is to make my own private copy of the thread and edit it in precisely this manner, though I'm the only one this benefits since it doesn't get sent back upstream.
I still think there is something here though. Stack Overflow replaced message boards, which were basically HTML versions of mailing lists, and part of that was identifying the semantics of question, answer and comment and defining new operators and new expectations for them.
A wiki is a good approach but because it's totally free-form, the user gets stuck doing the work of keeping things hygienic.
JIRA allows you to edit all the properties of a ticket whenever, but it generates such a huge cloud of email notifications in the process, it kind of disincentivises you from using it. And nobody is in the habit of rereading the page to see what is different since last time.
I agree that's partly it, but that seems ok when you're in the thick of discussing a problem/fix. If you're doing a code review or something after a fix has been pushed, you actually want certain messages to stand out to describe resolutions and whatnot.
So like gmail where you can star/mark certain replies as important and those messages would show up at top-level in the ticket, where all other messages are collapsed.
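A rough Python sketch of that view, assuming each message simply carries a `starred` flag (the names here are invented for illustration): starred messages are shown in full, and runs of unstarred ones are folded into a single placeholder line.

```python
from dataclasses import dataclass


@dataclass
class Message:
    author: str
    text: str
    starred: bool = False


def render_thread(messages):
    """Return display lines: starred messages in full, the rest collapsed."""
    lines = []
    collapsed = 0
    for msg in messages:
        if msg.starred:
            if collapsed:
                lines.append(f"[{collapsed} collapsed]")
                collapsed = 0
            lines.append(f"* {msg.author}: {msg.text}")
        else:
            collapsed += 1
    if collapsed:
        lines.append(f"[{collapsed} collapsed]")
    return lines


if __name__ == "__main__":
    thread = [
        Message("ann", "Can anyone reproduce this?"),
        Message("raj", "Yes, on 2.3 only."),
        Message("ann", "Root cause: stale cache key. Fixed in abc123.", starred=True),
        Message("raj", "Thanks!"),
    ]
    print("\n".join(render_thread(thread)))
```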
Full disclosure: I'm part of the maintainer staff.
Major feature that allows me to work around any shortcomings in your office: API access to everything and/or database access (preferably direct read/write access, but even if it's just a downloadable .sql.gz it's a huge benefit).
I'm probably not a typical user, though, FWIW.
For my latest startup I went looking for a service desk tool. The key criteria was "feels like email". The moment any alternative required a user signup just to lodge a support request, I ruled it out.
I ended up choosing Groove. I don't recommend it. All ticketing systems suck, this one just sucked the least for my support desk. Groove doesn't extend to other ticket types, and it's nowhere near as flexible or extensible as JIRA, and the mobile experience is horrible. But it does "feels like email" for my customers better than every alternative you care to mention.
That sounds like unnecessary micromanaging. You couldn't possibly have enough detailed knowledge to know the proper order of tasks in all cases. Possibly even most cases.
I agree that communicating the priorities is important, but the boots on the ground have a much better understanding of what they're working with than you do.
I don't know about you, but I despise filling out the same forms over and over again when seeing new healthcare providers. I'd love to start a service modeled after granular smartphone permissions where
(a) I check in at a new office (scan a code, they scan my code, beacon, something like that)
(b) the office then requests x, y, and z information
(c) a push is sent to my phone where I can review the information and approve or disapprove some or all permissions
(d) a final step of either entering my pin at the office, using my thumbprint on my device, or something else.
The key components would be storing the data encrypted at rest, following HIPAA and then some, having a solid auth protocol (keys, jwts, etc).
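The approval step in (b) and (c) could look roughly like this in Python. This is only a sketch of the permission logic under my own naming; it deliberately leaves out the encryption, auth, and HIPAA parts, which are the hard bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoRequest:
    """What an office asks for at check-in (hypothetical structure)."""
    office_id: str
    requested_fields: frozenset


def release(profile, request, approved_fields):
    """Return only the fields the office requested AND the patient approved."""
    allowed = request.requested_fields & set(approved_fields)
    return {name: profile[name] for name in sorted(allowed) if name in profile}


if __name__ == "__main__":
    profile = {
        "name": "Pat Example",
        "insurance_id": "INS-000",
        "allergies": ["penicillin"],
        "ssn": "000-00-0000",
    }
    request = InfoRequest("clinic-1", frozenset({"name", "allergies", "ssn"}))
    # The patient approves name and allergies on their phone, but not the SSN.
    print(release(profile, request, {"name", "allergies"}))
```

The intersection means an office never receives a field it did not ask for, even if the patient over-approves, and never receives one the patient declined.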
I think adoption would be helped because the public are already used to permissions like these when installing apps.
The benefits are a lack of paper trail, no one is going to not shred my SSN, my most up to date data is now available, and instead of hosting N apps/databases, I'm storing 1 and can reduce my maintenance, customer support issues because one for all, all for one.
Edit: edited for readability.
I'd suggest something much more low-tech - a website where you can punch in all your details - insurance, allergies, medical history, etc, etc... and then you can print it out (or a subset of it, for different kinds of providers) or generate a PDF that they can copy & paste into their horrible legacy system (an improvement on retyping), or, for those truly at the cutting edge - the kind of electronic transmission you speak of.
I am on board with what you're saying; an escape hatch for non- or semi-adopters. Obviously, printing is a way to go, so maybe on the mobile app, the ability to check each piece of information required then export/email to your preferred destination.
It'd also be interesting to look to make money on conversion, i.e. replacing, or integrating with, the outdated monsters you're talking about.
Maybe we're not even talking about healthcare anymore, maybe just the ability to piece together PII (personally identifiable information) and deliver it to X.
>>>> on another note
This goes into a topic I've seen posts on recently, and something of interest to me, personal indexing; a better way to throw blobs against the wall and have it indexed for me, leading to a personal Google. I mean, that's already coming, really, between Facebook and Google (especially Google Photos) but currently I see nothing about piecing together information I'd like to share on a professional level.
It's actually a pretty good solution for ad-hoc "working together" with someone (a lawyer / architect / whatever) on a project, where you have lots of files you need to share and refer to during the project.
I wonder if you could stitch together a workflow as a reseller for Google Apps (no clue what their current name is)?
Either way, good suggestion.
Sell the service to the patients for some smallish fee ($5 per month) and then provide the integrations into the various provider systems for free.
Later on you could scale it up to be an add-on to employee benefits or the health plans.
They also offer identity document verification with facial recognition crosscheck. They want to use this to detect visa overstayers for immediate deportation. That now looks like a market with potential.
Agreed. Especially considering they have an ENTIRE department devoted to personnel along with an office at every single unit level above platoon.
I'm sure it's possible to hack together an AHK script, combined with Pulover's macro creator to automate virtually anything repetitive on a Windows PC, or use Selenium to automate browser actions. Of course then you run the risk of having to fall into the classic XKCD automation time sink.
Edit to add: https://en.wikipedia.org/wiki/Carte_Vitale
I would love to hear from anyone else who has big ideas about, or is working on, driving outcomes towards holistic wellness with patient-centered healthcare, patient data collection/quantified self, and patient-powered research networks. In the bigger picture, I am passionate about making the world a better place through innovation and working on what really matters for humanity.
When you go to the doctor for a sinus infection, the cost of this is not fixed, even across insurance companies. The other factor is the "level" of service. (ref: http://medicaleconomics.modernmedicine.com/medical-economics...)
The more in-depth the examination and the more time you spend with the patient, the more they can charge. All those forms are "taking family history", etc. and it is free money since you have to do the work. Those are then scanned so they can be used later in an audit.
(Source: I also worked at a start-up that was trying to disrupt outpatient medical systems. It's very hard and has lots of roadblocks. btw, of the top 50 EMRs in the US, only 3 have APIs and these are mostly to pull data, not push it back in).
"So what happens when the Douglasses are no longer around? We have every reason to believe that we'll be around for a good long time, but we wanted a plan to provide for our loyal customers just in case we aren't so lucky. So we made one.
Here is how the plan works. Immediately upon learning of our deaths the executors of Dick's estate (his two highly computer literate kids) will post two files on our www.compmngr.com web site and will send out a broadcast email advising our customers how to download the files. The first file is a small standalone computer program called RegisterEvent.exe, which allows you to create your own registration files. So you won't have to register with Douglass Associates and you won't have to pay a registration fee. You can read more about RegisterEvent and how to use it below. The second file is a ZIP file containing all the source code for COMPMNGR and its supporting programs. This file will only be of interest to those few users who want to continue COMPMNGR development and who either know C++ programming or are willing to hire a C++ programmer."
Would it? See, here's a dirty little secret: people can't deal with change.
Any change made to the software means people have to learn something new. And that results in tech support.
I once had a very nice chat with the CEO of a CNC company and asked him why certain features weren't implemented since his hardware was clearly capable of it. He was quite blunt that a single new feature added about 30% to his tech support budget for almost 3 years, and his tech support budget was almost 1/3 of his annual budget.
So, he simply will not add a feature until it results in an expected 500K in increased revenue or he has to fend off a competitor.
> Still, it's a real opportunity.
Is it? Actually?
And do you know ballroom competitions well enough to get all the corner cases correct? The Douglasses have been to a LOT of competitions and probably wrote this because they got tired of the grief caused by badly run competitions.
How many ballroom competitions exist (<1000)? How much are they willing to pay (<$1000)? And how much will tech support cost?
So, this is less than $1,000,000 per year in revenue MAX. And, this software is already in place with people who know how to use it.
Your revenue will likely be $10-20K per year for a long while unless you completely displace this. And they can always drop their prices and block you out if they feel like it. And your tech support costs will be quite high.
I suspect the Douglasses made this same calculation and that's why they aren't improving it. It's just not worth the money.
See http://www.topturnier.de/ for what he is doing.
If one would like to do something in this space I'd go with a solution where you can rent the equipment, get it shipped to you in boxes and ship it back later. For larger organisers you could arrange for leasing options or an on-premise installation that has an auto-update.
The advantage would be to provide offline capabilities including a controlled network environment for adjudicators.
It looks like it's offered for free now. The thing with this kind of software, is it must be "good enough" for the task.
And here they ask for credit card details over http...
Geez. Small world.
Think somewhere on the order of 10,000 models per day throughput.
There's $BNs waiting for you. It's ridiculously hard.
I assume that process would be easy to speed up if the requirement for absolute accuracy was removed. The 8' ROMER arm we use is accurate to ~!2 microns over its entire volume, which is absolutely overkill for something intended to produce models for visual arts applications. A quick and dirty approach to generating the mesh might increase the inaccuracy by several orders of magnitude, but when a Coke can has dimensional tolerances to the tune of tenths of a millimeter, the quick and dirty mesh will still be representative of the end product.
Who would be the primary customers? The entire 3D capturing market is currently several $B per year, including services. Where would be the customers that aren't getting served today that would double this market?
That said, some people have tried to use RS for this problem, but from what I've seen end up just using Kinects.
But yea, there are a lot of us working on that.
10K scans/day is way beyond their limits, and I'm not sure that's a very common use case. But I bet they could get there if they wanted.
I was musing another kind of 'real world capture' with videogames, because I want to race around my neighborhood in Forza. https://hackernoon.com/dashcam-google-maps-dev-kit-custom-ne...
Maybe but I actually think that's the wrong approach.
I mean, to me it's kind of hard to believe nobody's tried making a "conveyor belt" like process inside a closed system
Yea they have - kinda. None of it works well or fast enough though. We put up a patent for one a year ago before I thought there was a better way to do it. The manpower required to move items onto/off of a line is a big part of the problem.
Taking that 10k number - assuming disparate types of items that might be part of a series like "Bathroom" (toothbrush, hair brush, toilet brush, plunger) - in 24 hours that means cycling each item through in about 8 seconds. The only way I remotely see that possible is essentially having a robot hand pick up the item at the entry point, hold it for the capture sequence (perhaps have a custom-designed 'mount' that can allow for true 360 via a couple positions), and then drop it out the other side.
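The "about 8 seconds" figure is easy to check, and it is worth seeing how it relaxes with parallel machines. A tiny Python sketch (the `parallel_lines` parameter is my own addition, not something proposed above):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def seconds_per_item(items_per_day, parallel_lines=1):
    """Time budget per item on each line, running around the clock."""
    return SECONDS_PER_DAY * parallel_lines / items_per_day


if __name__ == "__main__":
    for lines in (1, 4, 10):
        budget = seconds_per_item(10_000, parallel_lines=lines)
        print(f"{lines} line(s): {budget:.2f} s per item")
```

With a single line the budget is 8.64 seconds per item, and that assumes zero downtime over the full 24 hours; ten lines in parallel would give each one 86.4 seconds.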
It's the scale part I'm wondering about, re: one size machine fits all doesn't seem to make sense. One machine for items under a certain dimension (e.g. "hand held") then another for items where the machine has to essentially have super-powers to pick up and rotate objects to complete the imaging process (e.g. a couch, a dresser, a motorcycle, etc). I think trying too hard to accommodate outliers ends up tainting the balance of operations a little? Just thinking out loud, really cool puzzle.
IMO it should be done with a mixture of image segmentation and procedural generation.
Stereo structured light is great, but doesn't work on specularly reflective objects. You've seen those amazing depth maps from the guys at Middlebury? Wonder how they get perfect ground truth on motorbike cowls that are essentially mirrors? Well they have to spray paint them grey so that you can see the light. The next problem is that you're limited by the resolution of the projector (so I guess if you own a cinema, yay!) and the cameras. Then you have to do all the inter-image code matching, which sounds trivial in the papers but in practice is a lot harder (and since you don't get codes at all pixels you need to interpolate, etc, etc).
There are handheld scanners like the Creaform which work pretty well on small things, but I don't know what the accuracy is like.
The ultimate system would probably be a high-resolution, high-accuracy, scanned LIDAR system. Then you lose the problems with scanning ranges/depth of field, but you accept massively higher cost and possibly a much longer scan time for accurate systems.
That's been turned into an industry with very high throughput.
And inside the object as well?
I'm not sure what you're asking here. Are you asking if it should be better than what can be done with photogrammetry?
Doing just the outside is a big enough market/problem.
Not sure what kind of datasets you're looking for. You'll see actual products to test with.
I've been studying and working with GANs for about a year now. They are still very exciting, and I'd love to try to expand my codebase to new types of data.
Additionally, there are some recent techniques that haven't been tried with voxel-based renderings.
Perhaps there is another algorithm that can help go from voxel -> polygons as well.
I think with the right tech, time, and execution this could be a matter of:
1. Take a picture
2. Generate until you get the 3d model you want
Well, not exact cause I don't like their voxel building generation method.
I think a GAN + Procedural Generator is the winner.
edit: Let me know if you want to work on this cause it's an active area of research for us. See my HN profile for contact.
Curious to know more about your train of thought. I am working as a researcher in the domain and thinking of experimenting with GANs for 3D model estimation using similar inputs as the one in the paper I referred to.