My parents don't spend the summers in Alabama dude. It's 96 degrees here.
yeah your mom spends summers out in California with me I already know this
The worst part about Mom's house RN is that I think the air conditioning on the 1st floor is busted and it gets up to like 80 degrees during the later part of the day.
why don't you fix it or pay someone to come fix it
second floor air conditioning is fine.
My ex-girlfriend's apartment was too small. I need room to roam. At least 4000 sqft.
I don't think your mom's gonna be very happy when she comes back and the AC is broken
I still have a few more months and by then it'll be cool.
I forget your freakish wingspan. When you stretch out it probably takes up the average length of a 1200 sqft apartment living room
I don't know how people do it. I constantly have to pace the deer trails of the State Park that backs Mom's property. 1200sqft is insane.
ex girlfriend? I thought that was nyte?
I've hit the age where professional women at age 30 freak out and date you.
did you lose your virginity???
woah this is astounding did I miss this or something
I was dating a D-list celebrity actually.
but did you lose your virginity that's all I care about
No.
sorry to hear that
Damn that is a nice problem
You probably need a fine-tuned/custom LLM build to understand the text. Idk you have to evaluate whether models trained on the internet actually have a sufficient training data set to be able to work with just the text parts of these docs and how they perform against various custom options. That is project 1 and potentially 10s of thousands in budget depending how thoroughly you want to investigate your custom options. That's paper 1 right there (if your goal is writing gay ass academic shit that nobody reads in order to get some PhD and go work at Boeing)
Item 2 is the approach to the images/graphs. I know there are tools for this but have no experience there. But you want your RAG agent to understand when it's working with text vs graph and potentially use an agent graph for the different media (could even use different agents for the different types of text. I would build something where the different agents could use different models so you have a fine-tuned model specifically for a doc you know is full of formal language vs maybe a generic LLM call for more basic text)
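Rough sketch of that per-media routing idea, just to make it concrete. Everything here is made up for illustration: the `Chunk` type, the `kind` labels, and the three handler functions are hypothetical stand-ins for whatever real model calls you'd end up using, and it assumes something upstream has already labeled each chunk.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Chunk:
    kind: str      # assumed labels: "text", "formal_text", "graph"
    content: str


# Hypothetical stand-ins for real model calls (no actual LLM is invoked).
def generic_llm(chunk: Chunk) -> str:
    return f"[generic-llm] {chunk.content}"


def fine_tuned_llm(chunk: Chunk) -> str:
    return f"[fine-tuned-llm] {chunk.content}"


def vision_model(chunk: Chunk) -> str:
    return f"[vision] {chunk.content}"


# One agent per media type, each free to use a different model.
AGENTS: Dict[str, Callable[[Chunk], str]] = {
    "text": generic_llm,
    "formal_text": fine_tuned_llm,
    "graph": vision_model,
}


def route_chunk(chunk: Chunk) -> str:
    """Send a chunk to the agent for its media type; fall back to the generic LLM."""
    handler = AGENTS.get(chunk.kind, generic_llm)
    return handler(chunk)


if __name__ == "__main__":
    print(route_chunk(Chunk("formal_text", "Section 4.2: load requirements")))
    print(route_chunk(Chunk("graph", "Figure 3")))
```

The dict lookup is the simplest possible version. A real agent graph would probably need a classifier step in front to decide the `kind` in the first place.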
All this is just to get your LLM to a point where it can actually process the docs for storage pre-RAG. Then there is a layer of analysis (kind of speaks to your last point) on what that storage should look like to actually be useful to someone. Here you need to talk to people (preferably find one really smart guy with a vision and a ton of domain-specific experience) to understand what they actually want to pull out of the data. There is the very basic "Find me this document on topic ABC" or "Help me understand whether we've already solved XYZ problem" but there are also most likely hidden relationships/insights to find when you put 60 years of research (which probably sits in PDF and xlsx form on some fileserver somewhere) into a system that can actually parse it. For this you start to look at stuff like RAG in Graph DBs and then you have your chat assistant or whatever respond to queries with a set of tools that go to RAG-traditional, Graph RAG, etc etc and you have to help it determine how to best route the prompts to the best one of your 14 different RAG approaches for the same data set
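Toy version of the prompt routing part, assuming only two backends. The keyword list, `vector_rag`, and `graph_rag` are all hypothetical placeholders. In practice you'd probably let the LLM pick the tool itself rather than match keywords, but the shape is the same: look at the query, pick a retriever, call it.

```python
from typing import Callable, Dict, List


# Hypothetical retrievers; real ones would hit a vector store or a graph DB.
def vector_rag(query: str) -> List[str]:
    return [f"vector hit for: {query}"]


def graph_rag(query: str) -> List[str]:
    return [f"graph path for: {query}"]


RETRIEVERS: Dict[str, Callable[[str], List[str]]] = {
    "vector": vector_rag,
    "graph": graph_rag,
}

# Assumed heuristic: relationship-style questions go to the graph backend.
RELATIONSHIP_HINTS = ("related", "relationship", "connect", "depends", "influence")


def pick_retriever(query: str) -> str:
    """Return the name of the retriever that should handle this query."""
    q = query.lower()
    if any(hint in q for hint in RELATIONSHIP_HINTS):
        return "graph"
    return "vector"


def retrieve(query: str) -> List[str]:
    """Route the query to the chosen backend and return its results."""
    return RETRIEVERS[pick_retriever(query)](query)


if __name__ == "__main__":
    print(retrieve("Find me this document on topic ABC"))
    print(retrieve("How is project X related to project Y?"))
```

With 14 approaches instead of 2, the `RETRIEVERS` dict just gets bigger and `pick_retriever` is where all the hard work ends up.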
This is me as a person who definitely has less knowledge of the landscape than you opining just for fun
Also you aren't at Lockheed are you?
I think I know the guy who is in charge of a huge effort doing basically this at Lockheed