Started a master's program in comp sci, hoping to become a data scientist.
The class I’m currently taking has me working in Scala, PyTorch, MLlib, Hadoop, Hive, Pig, and pandas.
I feel like I’m barely surviving. I thought I liked the idea of being a data scientist because I enjoy coding and doing analytics at my job, but this is really something else and if this is what I can expect for the rest of the program + my future job, I’m really having doubts that this is for me.
I guess I should just be an analyst?
Is there a version of data scientist that exists that doesn’t require intense amounts of calculus and linear algebra? How about just importing ML libraries in python and calling the pre-built functions, tuning parameters, and then examining the results without all of the math nonsense?
Is the term for that stupid data scientist? Data engineer? Someone with experience please chime in.
Keep in mind that 4-5 years ago, accomplishing the same tasks would have required in-depth knowledge of “ML”, and you’d almost certainly have had to finagle with an underlying codebase. Now it’s ~30 lines of Python that a CS sophomore could write.
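For a rough idea of what that "import a library, tune parameters, look at the results" workflow looks like, here is a minimal sketch using scikit-learn. The dataset (iris), the model (a random forest), the parameter grid, and the function name `train_and_evaluate` are all my own picks for illustration, not anything from the course or a real job.

```python
# Illustrative sketch only: dataset, model, and parameter grid are assumptions.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split


def train_and_evaluate(seed: int = 0):
    """Fit a pre-built model, tune two parameters, and score it on held-out data."""
    # Load a small built-in dataset as feature matrix X and label vector y.
    X, y = load_iris(return_X_y=True)

    # Hold out 25% of the rows so the final score is measured on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    # Parameter tuning: try every combination in the grid with 5-fold cross-validation.
    param_grid = {"n_estimators": [50, 100], "max_depth": [2, 4, None]}
    search = GridSearchCV(RandomForestClassifier(random_state=seed), param_grid, cv=5)
    search.fit(X_train, y_train)

    # Examine the results: the winning parameters and the held-out accuracy.
    predictions = search.predict(X_test)
    return search.best_params_, accuracy_score(y_test, predictions)


if __name__ == "__main__":
    best_params, accuracy = train_and_evaluate()
    print("best parameters:", best_params)
    print("held-out accuracy:", round(accuracy, 3))
```

None of this needs hand-written calculus or linear algebra. The library handles the math, and your job is choosing the grid and judging whether the score means anything.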
You should brand yourself as a “data engineer” and learn the skills to do the following: sanitize, order, and store datasets.
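To make "sanitize, order, and store" concrete, here is a small sketch with pandas and the standard-library `sqlite3` module. The column names (`sensor`, `value`), the table name `readings`, and the helper `sanitize_order_store` are hypothetical. Real pipelines would have their own schema and most likely a different storage backend.

```python
# Illustrative sketch only: column names, table name, and helper are hypothetical.
import sqlite3

import pandas as pd


def sanitize_order_store(raw, conn, table="readings"):
    """Clean a raw DataFrame, sort it, and write it to a SQLite table."""
    clean = raw.copy()

    # Sanitize: normalize text, coerce bad numbers to NaN, drop unusable and duplicate rows.
    clean["sensor"] = clean["sensor"].str.strip().str.lower()
    clean["value"] = pd.to_numeric(clean["value"], errors="coerce")
    clean = clean.dropna(subset=["sensor", "value"]).drop_duplicates()

    # Order: sort by sensor, then by value, and renumber the rows.
    clean = clean.sort_values(["sensor", "value"]).reset_index(drop=True)

    # Store: write the cleaned rows to the database, replacing any existing table.
    clean.to_sql(table, conn, if_exists="replace", index=False)
    return clean


if __name__ == "__main__":
    raw = pd.DataFrame(
        {
            "sensor": [" B", "a", "a", None, "b "],
            "value": ["2.5", "1.0", "1.0", "3.0", "oops"],
        }
    )
    with sqlite3.connect(":memory:") as conn:
        sanitize_order_store(raw, conn)
        print(conn.execute("SELECT sensor, value FROM readings").fetchall())
```

The messy rows (missing sensor name, non-numeric value, exact duplicate) are dropped before anything is written, which is the bulk of what this kind of work tends to look like day to day.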
When you are computing in the cloud, like me (I store all my macros in DynamoDB), you do not have to think about writing efficient, clean code. You can just add resources until everything works for the low, low price of 4¢ + a quick blowie in the water closet for jeff b