
Ask HN: Help my data analyst team not suck - splatt
I was just promoted to lead a team of 10 data analysts at a behind-the-times
insurance co.

I'm in over my head, so this is a Peter Principle event, but I want to give it
a shot before suggesting the VP fire me and hire someone competent.

The team works on a variety of reporting and analysis projects, and everyone
uses their own preferred tools (SAS, SQL, Python, Excel, Access).

The team was formed by taking "data people" out of many departments and moving
them to a centralized "analytics" team. The departments are now our customers.

The company also purchased a QlikView server and licenses, and dedicated two
people on the team to maintaining the environment and developing dashboards.

I have free rein to organize the team as I see fit.

I'd like to move to working in a common toolset, using a project management
tool, having standards/documentation, using source control, etc.

I've spent most of my career in a similar role to the other analysts on my
team, so I don't have a ton of experience using software development methods
like this myself. I know enough to think we could benefit from them. I don't
want to introduce unnecessary bureaucracy, but I am willing to add ramp-up
time to get people working better in the long run.

All of our data is stored in a poorly organized Oracle warehouse maintained by
a slow-moving IT team.

The analysts independently do a lot of transformation and cleanup of data from
the warehouse before being able to use it in their work.

I hope to build a pipeline where we treat the IT warehouse as our source and
write an automated ETL process to clean, organize, and enrich data before
loading it into our own database, which would serve as the source for our
QlikView dashboards and other reports.

Wondering if you all could offer some advice for pulling this off.

What's the best way to introduce some standards and process in a project like
this?

Possible? Or hopeless and I should commit seppuku?

Thanks!
======
itronitron
Sounds like fun. You should probably look at Jupyter or Spark as a system for
managing the data transformations, one that also lets team members create and
share scripts and workbooks.

Develop the ETL process so that it just pulls data and writes it out in your
team's ideal form as flat files. Then write a second process that pushes that
data where and how you want it (because that can change in six months). Also
develop automated processes for measuring and ensuring the quality of data
being added to your system.
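
As a rough illustration of that two-stage shape, here is a minimal Python
sketch. It is only an assumption-laden example, not a recommendation of
specific tooling: sqlite3 stands in for both the Oracle warehouse and the
team's own database, and the `raw_policies` table, the `policies` table, and
the three columns are hypothetical names made up for the example.

```python
import csv
import sqlite3

# Hypothetical column layout for the cleaned flat file.
FIELDS = ["policy_id", "state", "premium"]


def extract_to_flat_file(source_conn, out_path):
    """Stage 1: pull rows from the source, tidy them, write a CSV flat file."""
    rows = source_conn.execute(
        "SELECT policy_id, state, premium FROM raw_policies"
    )
    count = 0
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        for policy_id, state, premium in rows:
            # Example cleanup step: normalize state codes.
            writer.writerow([policy_id, (state or "").strip().upper(), premium])
            count += 1
    return count


def check_quality(path):
    """Quality gate: return a list of problems found in the flat file."""
    problems = []
    seen = set()
    with open(path, newline="") as f:
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(csv.DictReader(f), start=2):
            if row["policy_id"] in seen:
                problems.append(f"line {line_no}: duplicate policy_id")
            seen.add(row["policy_id"])
            if len(row["state"]) != 2:
                problems.append(f"line {line_no}: bad state code")
            try:
                if float(row["premium"]) < 0:
                    problems.append(f"line {line_no}: negative premium")
            except ValueError:
                problems.append(f"line {line_no}: premium is not a number")
    return problems


def load_flat_file(path, target_conn):
    """Stage 2: push the flat file into the target database."""
    target_conn.execute(
        "CREATE TABLE IF NOT EXISTS policies "
        "(policy_id TEXT PRIMARY KEY, state TEXT, premium REAL)"
    )
    with open(path, newline="") as f:
        rows = [
            (r["policy_id"], r["state"], float(r["premium"]))
            for r in csv.DictReader(f)
        ]
    target_conn.executemany(
        "INSERT OR REPLACE INTO policies VALUES (?, ?, ?)", rows
    )
    target_conn.commit()
    return len(rows)
```

The idea is to run `extract_to_flat_file`, then load only if `check_quality`
returns an empty list. Because the flat file sits between the two stages, the
load step can be swapped for a different target later without touching the
extract.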

~~~
splatt
I'm not familiar with Spark beyond being aware of it. We have a good amount of
space allocated on a fairly powerful Oracle server, and I was thinking to
store all of our tables there. If an alternate system has big advantages, I
could get it done, but I would have to get the IT software team on board with
letting me install it, getting a server, etc.

I have used Jupyter, but was thinking to have the team settle on mainly using
SAS code to build the ETL process, since that is the language most of them are
familiar with (even though I personally HATE writing SAS).

~~~
itronitron
If you want to test out Spark, then a trial account on databricks.com is
probably a good place to start. If the team is used to SAS, though, I'd stick
with that.

------
auslegung
Passing your vision and dream to everyone is your most powerful move. Imagine
for a moment what that would be like, what that would accomplish. This book is
an amazing read for people in your position:
[https://www.amazon.com/Switch-Change-Things-When-Hard/dp/038...](https://www.amazon.com/Switch-Change-Things-When-Hard/dp/0385528752)

~~~
splatt
Thanks, I'll check it out. Getting buy-in from this group will be a challenge,
but it has been positive so far.

------
DoreenMichele
FWIW:

When I was at an insurance company, I kept suggesting they introduce GIS
because so much of the data is location-based. They kept blowing me off.

That's all I've got.

