Thursday 31 October 2019

[beginner] forcing asynchronous functions to execute synchronously

Hey all! (TL;DR at the bottom if you don’t care for context.)

So I’m not new to programming, but I’m definitely new to Node. My background is in languages like C# and Java, just for context; basically nothing web-based.

Recently, to help teach myself the language, I’ve been putting together a hobby app. The premise isn’t important for the moment, but I’ll be doing a whole lot of interaction with a database full of information.

It’s a pre-existing dataset that I’ll be working with, and it has about 26,000 records in it. The issue is that before I can actually use that data for what I want, I need to take the raw data and combine it into a wholly different data format, which will group a lot of those records together and do some math on them. To help visualize, it’s sports data, and the initial raw data I have is every historical season individually - so a player, a year, a team, and stats for that year. So one player could have many different seasons. I want to create a new data set organized by player, and each player document will have an array underneath it with all of their associated seasons.

So the organization and transformation piece isn’t so bad. I’ve successfully gotten it to take in a season, check whether a player has already been created for it, and then appropriately either create a player, create a season under an existing player, or do nothing for duplicates.

The problem comes in now running that for the entire DB of 26,000 records. You can guess what happened: I queried for all records, sent that array to a function, and then attempted to just do a forEach over the array and call my converter function for each record.

And of course, that didn’t work. It sure did call the function about 20,000 times, but I think I effectively overloaded my DB with requests, and finally Mongo just force-closed the connections, and it didn’t finish. I configured mongoose to allow me 20 open connections, rather than 5; I also upped the timeout to 10 minutes. That got more done, but ultimately the same thing happened.

Moreover, since I do kinda need to do this in order (so that player objects can be created, and then later objects can come in and be grouped under the first object), I’d like a way to do this synchronously, or at the very least, do it in small batches with a configurable delay. I don’t actually care if it takes 5 hours or something - this is just a one-time process to get my data in place the first time, so I can actually do a cool thing with it.

So TL;DR - how do I take asynchronous actions (i.e., interacting with MongoDB) and make them synchronous so they execute in sequence? I’m using mongoose for schema and models.
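
For illustration, here is a minimal sketch of the two approaches the question asks about: strictly sequential processing, and small batches with a configurable delay. It is not the poster's code. The names `sleep`, `processSequentially`, `processInBatches`, and `convertSeason` are hypothetical, and the sketch assumes the converter is an async function (or otherwise returns a promise) that resolves once its database work is done. A plain `forEach` does not wait for promises returned by its callback, which would be consistent with all the calls firing at once; a `for...of` loop with `await` does wait.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Process records strictly one at a time, in array order.
// `convertSeason` stands in for the poster's converter function and is
// assumed to return a promise that settles when its DB work is finished.
async function processSequentially(seasons, convertSeason) {
  for (const season of seasons) {
    // The loop does not move on until this record is fully handled.
    await convertSeason(season);
  }
}

// Process records in fixed-size batches, pausing between batches.
// Records inside one batch run concurrently, so ordering is only
// guaranteed between batches, not within a batch.
async function processInBatches(seasons, convertSeason, batchSize = 100, delayMs = 0) {
  for (let i = 0; i < seasons.length; i += batchSize) {
    const batch = seasons.slice(i, i + batchSize);
    await Promise.all(batch.map((season) => convertSeason(season)));

    const moreToDo = i + batchSize < seasons.length;
    if (delayMs > 0 && moreToDo) {
      await sleep(delayMs);
    }
  }
}
```

Because the batched version runs the records within a batch concurrently, two seasons for the same not-yet-created player could race if they land in the same batch. When order matters as described above, the sequential version is the safer choice, at the cost of speed.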

Submitted November 01, 2019 at 05:04AM by Dreadmaker
