Tuesday 25 June 2019

Node performance questions

Hi all, intermediate Node dev here. I have a few questions about debugging and measuring the performance of Node.js apps.

Some general questions:

- Is there an option or flag to see what the garbage collector is doing: where/when memory allocation is happening, GC pauses, etc.?
- Is there a way for me to print out the assembly the V8 compiler generates?
- To time the code I can just use a timer and run my function or app 10k times or something, I guess. Is there a better way to do this?

More specific ones:

(Hypothetical question) Say I have a .csv file with 100k lines. I parse each line in a loop and create an object with a key for each column in the .csv and a value for each row entry in that column. Now I am making 100k objects in a short amount of time. Say my CSV header looks like

    id, name, location, email, address, phone, number, tags

My 100k objects would look something like

    {
      id: UUID,
      name: "Frank N. Stein",
      location: { lat: 3, long: 1415 },
      email: "whatmeworry@gmail.com",
      address: "123 Elm St.",
      phone: "123 123 1234",
      number: 1,
      tags: ["blah", "other tag", "another tag"]
    }

The keys would all be the same, but the values are different for all 100k rows. If I create a new object for every row, I would be creating 100k objects. What happens if instead I use a single template object with that same shape, and for each row modify the values like

    template.id = "fdfdf"
    template.name = "ur mom"

then write the template to a file before processing the next row?

Will V8 allocate one object and change the pointers for the new values, or will it copy the original?
And how can I learn about how this works and see it in action?

I am streaming a file larger than the available memory on my machine. For each chunk of data that I stream, I make an API call and add it to a promises array. I set a batch size of 100 for the promise array, and once I get to 100 I await all the API calls. (I don't know how many open sockets I have in the pool, so I don't know how many requests are in flight at the same time and how many are queued. Is there a way to check this? I just chose 100 as a nice round number for this example.)

Now my code will pause in 2 cases:

1. When I am awaiting the resolution of all the promises. (I don't know if the stream keeps reading during this time; I don't think so, because my await Promise.all() is inside my stream's on('data') event handler.)
2. When the stream pauses to let the consumer consume more data because of backpressure.

My whole stream is wrapped in a new Promise() which resolves when the stream ends, or rejects if the stream errors.

Say I want to process 100 such streams in parallel, and I just do something like

    (async () => {
      const promises = []
      files.forEach(file => {
        promises.push(streamFile(file))
      })
      await Promise.all(promises)
    })()

What's happening? Is Node processing each stream in sequence, or is it processing multiple streams at the same time (some chunks from stream 1, some chunks from stream 2)? How does this work?

Am I thinking about this stuff correctly? How do I see what's going on "under the hood"?

Any resources that have any of this info would help me a lot; I just couldn't find anything.

Sorry for the long-ass question, thanks!
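The batching-plus-parallel-streams pattern described above can be sketched in isolation. This is not the poster's actual code: `fakeApiCall` is a stand-in for the real API request, and plain arrays stand in for stream chunks, so the concurrency behavior is visible without any file I/O.

```javascript
// Stand-in for the real API call: resolves asynchronously with the chunk's length.
async function fakeApiCall(chunk) {
  return new Promise(resolve => setImmediate(() => resolve(chunk.length)));
}

// Batching pattern: accumulate promises, and every `batchSize` of them,
// pause (await) until the whole batch resolves before continuing.
async function processInBatches(chunks, batchSize = 100) {
  const results = [];
  let batch = [];
  for (const chunk of chunks) {
    batch.push(fakeApiCall(chunk));
    if (batch.length >= batchSize) {
      results.push(...await Promise.all(batch)); // pauses here, like the on('data') await
      batch = [];
    }
  }
  if (batch.length) results.push(...await Promise.all(batch)); // flush the final partial batch
  return results;
}

// Kick off several "streams" at once. Promise.all starts them all immediately,
// so the event loop interleaves their async work rather than running them
// strictly one after another.
(async () => {
  const inputs = [['aa', 'bbb'], ['c'], ['dddd', 'e']];
  const all = await Promise.all(inputs.map(chunks => processInBatches(chunks, 2)));
  console.log(all); // logs the per-stream results: [[2, 3], [1], [4, 1]]
})();
```

Because each `processInBatches` call only awaits its own I/O, starting three of them via `Promise.all` means chunks from different "streams" are processed interleaved on the single event loop, not in sequence; the same holds for real file streams, with backpressure adding extra pauses per stream.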

Submitted June 26, 2019 at 02:57AM by whileAlive_doStuff
