[Front/Back] CDN #63

its-colby · 2024-10-22T17:57:34Z

In order to horizontally scale our sole file cache server, we need to do a few things.

There are two approaches: build our own CDN, or use existing CDN infrastructure.

I recommend we use CDN infrastructure provided by one of the many providers (Google, Amazon, etc.). This would likely mean separating business logic and network requests into cloud lambda functions and gateways. An example using AWS is below:

1. Amazon CloudFront Setup

CloudFront Distribution: Create a CloudFront distribution to serve as your CDN. This distribution will handle incoming requests and serve cached data.
Caching Policy: Set appropriate caching policies that define how frequently your edge locations should check for updated content from your origin servers, which in this case will include both static and dynamically fetched blockchain data.
2. AWS Lambda@Edge

Request Interceptor: Use Lambda@Edge to intercept requests at the CDN. This function will:
Check the user’s API key or token included in the request headers.
Call an API Gateway endpoint to verify if the user has sufficient credits for the request.
If credits are sufficient, proceed to serve the request. If not, return an appropriate error message.
Response Interceptor: Optionally, you can also intercept responses to log usage or modify responses before they are sent back to the user.
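
The request interceptor steps above can be sketched as a Python Lambda@Edge viewer-request handler. This is a minimal sketch, not a definitive implementation: the `x-api-key` header name, the 401/402 status choices, and the injected `has_credits` callable (standing in for the call to the API Gateway credit-check endpoint) are all assumptions for illustration.

```python
from typing import Any, Callable, Dict


def _error(status: str, description: str, message: str) -> Dict[str, Any]:
    """Build a CloudFront-generated response that short-circuits the request."""
    return {
        "status": status,
        "statusDescription": description,
        "headers": {
            "content-type": [{"key": "Content-Type", "value": "text/plain"}],
        },
        "body": message,
    }


def make_viewer_request_handler(has_credits: Callable[[str], bool]):
    """Return a handler; has_credits is a hypothetical credit-check hook."""

    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
        request = event["Records"][0]["cf"]["request"]
        # CloudFront lowercases header names; each maps to a list of {key, value}.
        values = request.get("headers", {}).get("x-api-key", [])
        api_key = values[0]["value"] if values else None
        if not api_key:
            return _error("401", "Unauthorized", "Missing API key")
        if not has_credits(api_key):
            return _error("402", "Payment Required", "Insufficient credits")
        # Returning the request unchanged lets CloudFront continue processing it.
        return request

    return handler
```

Injecting the credit check keeps the handler testable without network access; in a deployment, `has_credits` would wrap the HTTP call to the credit-check endpoint.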
3. AWS API Gateway and AWS Lambda for Credit Management

API Gateway: Set up an API Gateway to manage incoming requests for credit checks and blockchain queries. This serves as the centralized point for all API calls.
Lambda Functions: Create several Lambda functions to handle different tasks:
Credit Check Function: This function verifies if the user has enough credits to perform the operation. It interacts with DynamoDB to check the user's remaining credit balance.
Blockchain Query Function: If CloudFront and Lambda@Edge determine that the cached data is stale or missing, this Lambda function is triggered to fetch the latest data from the blockchain. The complexity of the logic for querying the blockchain depends on your specific blockchain implementation and requirements.
4. Amazon DynamoDB

Database Schema:
Users Table: Store user identification, API keys, and other authentication details.
Credits Table: Maintain credit balances, including data on credits used for uploads and downloads.
Operations: Ensure that the database can handle high read/write throughput for credit checks and updates, possibly using DynamoDB Accelerator (DAX) if latency is a critical factor.
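
One way to keep credit checks and updates consistent under high throughput is to combine them into a single conditional DynamoDB update, so a debit only succeeds if the balance is sufficient. The sketch below only builds the parameters for an `UpdateItem` call; the table name `Credits`, the key `user_id`, and the attribute `balance` are hypothetical and not taken from any agreed schema.

```python
from typing import Any, Dict


def build_debit_request(user_id: str, cost: int, table: str = "Credits") -> Dict[str, Any]:
    """Build UpdateItem parameters that debit `cost` only if the balance covers it.

    The table, key, and attribute names are illustrative assumptions.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return {
        "TableName": table,
        "Key": {"user_id": {"S": user_id}},
        # Decrement and check in one request, so concurrent debits cannot overdraw.
        "UpdateExpression": "SET #bal = #bal - :cost",
        "ConditionExpression": "#bal >= :cost",
        "ExpressionAttributeNames": {"#bal": "balance"},
        "ExpressionAttributeValues": {":cost": {"N": str(cost)}},
        "ReturnValues": "UPDATED_NEW",
    }


# Usage with boto3 (not executed here):
#   client = boto3.client("dynamodb")
#   client.update_item(**build_debit_request("user-123", 5))
# If the condition fails, DynamoDB rejects the write with a
# ConditionalCheckFailedException, which can be mapped to "insufficient credits".
```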
5. Security Setup with AWS IAM

Roles and Policies: Create IAM roles and policies that strictly define who can access the Lambda functions, DynamoDB tables, and API Gateway endpoints. Ensure that Lambda@Edge functions have the necessary permissions to invoke API Gateway.
API Security: Secure API Gateway endpoints using API keys or OAuth tokens to ensure that only authenticated users can make requests.

Depending on what we decide with files gateway and object indexer, that logic can be put in separate DynamoDB tables, gateways, and lambdas, or included in the same ones as above.

The alternative approach is to build our own CDN (not recommended).

First, we need to use GeoDNS (simpler) or Anycast, via services like Cloudflare, Amazon, etc., to route requests based on geographic location. The developer should use whatever technologies they are most comfortable with. In other words, this approach shouldn't need a load balancer (although GeoDNS might be doing this behind the scenes, not sure).
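
To illustrate the idea behind geographic routing, the sketch below picks the cache server closest to a client by great-circle distance. It is only an illustration: a GeoDNS provider makes this decision for us, typically from the resolver's or client's IP location, and the server names and coordinates here are made up.

```python
import math
from typing import Dict, Tuple

# Hypothetical cache servers with approximate (latitude, longitude).
EDGE_SERVERS: Dict[str, Tuple[float, float]] = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_edge(lat: float, lon: float, servers: Dict[str, Tuple[float, float]] = EDGE_SERVERS) -> str:
    """Return the name of the server geographically closest to the client."""
    return min(servers, key=lambda name: haversine_km(lat, lon, *servers[name]))
```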

Second, on the frontend, we should have a message stating that constantly switching locations via a VPN will result in slower site performance, as we use a geographic cache. This warning message should be displayed on account creation. In the future, it could be dynamically generated when we detect a particular client switching locations (managed through device fingerprinting).

Third, we do not care about duplication of content (cache entries) across servers. In other words, it is ok if image.jpeg is stored on two servers. It is not a distributed database. HOWEVER, we must maintain the global state of our credit management system. In other words, across servers, we must keep track of the upload/download/pinning usage of all clients.

To implement the global credit management state, we will separate it from the CDN and put it on its own server. Servers within the CDN will query the global credit server before allowing uploads/downloads/pins. Similarly, they will post atomic updates to the global credit server.
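
A minimal sketch of what "atomic updates" means for the global credit server: the balance check and the debit happen under one lock, so two cache servers cannot both spend the same credits. The `CreditLedger` class and its in-memory storage are hypothetical; a real server would back this with a database that offers the same check-and-debit guarantee.

```python
import threading
from typing import Dict


class CreditLedger:
    """In-memory stand-in for the global credit server's state (illustrative only)."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}
        self._lock = threading.Lock()

    def deposit(self, client_id: str, amount: int) -> None:
        """Add credits to a client's balance."""
        with self._lock:
            self._balances[client_id] = self._balances.get(client_id, 0) + amount

    def try_debit(self, client_id: str, cost: int) -> bool:
        """Debit `cost` credits if available; check and update are one atomic step."""
        with self._lock:
            balance = self._balances.get(client_id, 0)
            if balance < cost:
                return False
            self._balances[client_id] = balance - cost
            return True

    def balance(self, client_id: str) -> int:
        """Return the client's current balance (0 if unknown)."""
        with self._lock:
            return self._balances.get(client_id, 0)
```

A cache server would call `try_debit` before serving an upload, download, or pin, and refuse the operation when it returns `False`.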

In the future, if the global credit server is a bottleneck, there are multiple ways to scale it. But, the simplest and best would be to distribute it and just use a distributed database service that abstracts all the complexities away from us.

its-colby added the low priority and mvp labels Oct 22, 2024

its-colby added this to the Auto Drive MVP milestone Oct 22, 2024

clostao closed this as completed Apr 15, 2025
