Cloud

Write an awesome doc for cloud and how to deploy our apps using technologies like AWS, GitHub Actions, CI/CD and Docker. Also a very comprehensive documentation around AWS.


Cloud integration

Our apps need to talk to each other, and there are two types of communication:

Synchronous communication
  • Description: Applications talk directly to each other.
  • Pros:
    • Simplicity.
    • Easier to implement.
    • Immediate consistency.
  • Cons:
    • Tight coupling.
    • Blocking behavior.
    • Scalability issues.
  • Used in:
    • RESTful APIs.
    • Client-server models.
    • RPC (Remote Procedure Call) systems.

Asynchronous communication
  • AKA: Event-based communication.
  • Description: Applications talk to each other through a message queue.
  • Pros:
    • Decoupling.
    • Scalability.
    • Performance.
    • Resilience and fault tolerance.
  • Cons:
    • Complexity.
    • Eventual consistency.
    • Difficulty in error handling.
  • Used in:
    • Event-driven architectures.
    • Microservices with message brokers.
    • Microservices with publish/subscribe models.

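Choosing between asynchronous and synchronous communication depends on factors like whether the caller needs an immediate response, how much delay the operation can tolerate, and how much scalability and fault tolerance it requires.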

Systems often combine both: synchronous communication where immediate responses and simplicity are crucial, and asynchronous communication for operations that can tolerate delays or require higher scalability and fault tolerance. Next, we look at the AWS services that enable us to implement an asynchronous architecture.
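To make the difference concrete, here is a minimal sketch of both styles. It uses no AWS service: InMemoryQueue is a hypothetical in-memory stand-in for a message queue, and the function names (processOrderSync, requestOrderProcessing, consumeOne) are made up for illustration.

```javascript
// Synchronous style: the caller invokes the worker directly and waits for the result.
function processOrderSync(orderId) {
  return `processed:${orderId}`;
}

// Hypothetical in-memory stand-in for a message queue such as SQS.
class InMemoryQueue {
  constructor() {
    this.messages = [];
  }
  send(body) {
    this.messages.push(body);
  }
  receive() {
    // Returns undefined when the queue is empty.
    return this.messages.shift();
  }
}

// Asynchronous style: the caller only enqueues a message and returns at once.
function requestOrderProcessing(queue, orderId) {
  queue.send(JSON.stringify({ orderId }));
  return "accepted";
}

// A consumer, running independently of the caller, picks messages up later.
function consumeOne(queue, results) {
  const raw = queue.receive();
  if (raw === undefined) return false;
  const { orderId } = JSON.parse(raw);
  results.push(processOrderSync(orderId));
  return true;
}

const queue = new InMemoryQueue();
const results = [];
console.log(processOrderSync(1));              // the result is available immediately
console.log(requestOrderProcessing(queue, 2)); // only an acknowledgement comes back
consumeOne(queue, results);                    // the work happens later, elsewhere
console.log(results);
```

In the asynchronous path the caller never sees the result directly, which is where the decoupling comes from and also why error handling and consistency get harder.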

SQS

SQS high level visualization

Video processing example

Assume you have a group of replicated RESTful API instances that receive HTTP requests, including video uploads, from your client-side applications (e.g. mobile apps, SPAs). Now you need to process those videos. You could do it in your RESTful API backend (assume it is written in Node.js), but then you might face difficulties such as performance issues, tight coupling, and a cost-inefficient backend.

In these cases we can have a separate service that takes care of video processing. Our Node.js app simply pushes a new message into the message queue (SQS), and the video processing service consumes it.

SQS video processing example
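As a rough sketch of the message exchanged in this example, the producer could build its send parameters like this and the video service could parse the body on the other side. QueueUrl and MessageBody are the SQS parameter names. The body layout (bucket, key, targetFormat), the function names, and the queue URL are assumptions made for illustration. The actual sqs.sendMessage call is left out so the snippet runs without the AWS SDK.

```javascript
// Producer side: turns one uploaded video into SendMessage parameters.
// The body layout is a hypothetical convention for this sketch.
function buildVideoJobParams(queueUrl, video) {
  if (!queueUrl) throw new Error("queueUrl is required");
  if (!video || !video.bucket || !video.key) {
    throw new Error("video.bucket and video.key are required");
  }
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify({
      bucket: video.bucket,
      key: video.key,
      targetFormat: video.targetFormat || "mp4",
    }),
  };
}

// Consumer side: the video service parses the body it received from the queue.
function parseVideoJob(messageBody) {
  const job = JSON.parse(messageBody);
  if (!job.bucket || !job.key) throw new Error("malformed video job");
  return job;
}

// Placeholder queue URL, not a real queue.
const params = buildVideoJobParams(
  "https://sqs.us-east-1.amazonaws.com/123456789012/video-jobs",
  { bucket: "uploads", key: "raw/cat.mov" }
);
console.log(params.MessageBody);
console.log(parseVideoJob(params.MessageBody).key);
```

Sending only a reference to the video (not the video itself) keeps the message small. The file stays in storage and the consumer fetches it when it is ready to work.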

FIFO queue feature

SQS FIFO vs. standard SQS

Creating an SQS queue

When you create a new SQS queue, you need to specify:

  1. Visibility timeout:

    • The time a message is hidden from other consumers after being received by one consumer.
    • If a consumer doesn’t process and delete the message before the visibility timeout expires, the message becomes visible again to other consumers.
    • To ensure a message is only processed once, it must be deleted before the visibility timeout ends.
    • The default visibility timeout is 30 seconds for all messages in the queue.
    • The visibility timeout should be set longer than the maximum time needed to process and delete a message.
    • Set the visibility timeout longer than the AWS SDK read timeout, whether using short or long polling with the ReceiveMessage API.

      • AWS SDK read timeout: The maximum time the AWS SDK will wait for a response when making a call to receive messages.
      • Polling mechanisms:

        Short polling (default)
          • Description: sqs.receiveMessage queries a subset of SQS servers to find available messages and sends an immediate response.
          • Upsides:
            • Simple implementation.
            • Immediate response.
          • Downsides:
            • Higher request rates.
            • Increased server load.
          • Use cases:
            • Suitable for big queues where we can assume messages are always available.
            • Single-threaded apps that need to poll multiple queues. Long polling may not work there, because the main thread is blocked while waiting and messages from the other queues cannot be polled.

        Long polling
          • Description: sqs.receiveMessage queries all SQS servers for messages and sends a response once at least one message is available.
          • Upsides:
            • Reduces cost by cutting the number of empty responses (WaitTimeSeconds).
            • Reduces false empty responses by querying all (rather than a subset of) SQS servers.
            • Returns messages as soon as they become available.
          • Downsides: None that I can remember. It can be somewhat harder to implement, although that is not too hard either:
            • Multi-threading.
            • Configuring HTTP to not close the connection when your WaitTimeSeconds is 20 seconds.
          • Use cases:
            • Processing messages as soon as they become visible.
            • Long polling is almost always preferable to short polling.

    Visibility timeout infographic
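The visibility timeout rules above can be sketched with a small in-memory simulation. This is not how SQS is implemented and it does not use the AWS SDK. VisibilityQueue and its methods are hypothetical names, and time is passed in explicitly (in seconds) so the behavior is easy to follow.

```javascript
// Hypothetical in-memory model of a queue with a visibility timeout.
class VisibilityQueue {
  constructor(visibilityTimeoutSeconds = 30) {
    this.visibilityTimeoutSeconds = visibilityTimeoutSeconds;
    this.messages = []; // each entry: { id, body, visibleAt }
    this.nextId = 1;
  }

  send(body, now) {
    this.messages.push({ id: this.nextId++, body, visibleAt: now });
  }

  // Returns the first visible message and hides it for the timeout period.
  receive(now) {
    const message = this.messages.find((m) => m.visibleAt <= now);
    if (!message) return null;
    message.visibleAt = now + this.visibilityTimeoutSeconds;
    return { id: message.id, body: message.body };
  }

  // A consumer must delete the message once it has finished processing it.
  deleteMessage(id) {
    const before = this.messages.length;
    this.messages = this.messages.filter((m) => m.id !== id);
    return this.messages.length < before;
  }
}

const demo = new VisibilityQueue(30);
demo.send("video-42", 0);
const first = demo.receive(0);       // delivered to one consumer
console.log(demo.receive(10));       // null: still hidden from other consumers
console.log(demo.receive(30).body);  // "video-42": not deleted in time, so it is delivered again
demo.deleteMessage(first.id);        // after deletion it is gone for good
console.log(demo.receive(100));      // null
```

The second delivery at t=30 is the duplicate processing that a long enough visibility timeout, combined with a timely delete, is meant to avoid.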

  2. Delivery Delay:

    • This is the time new messages are hidden before consumers can see them.
    • Useful to coordinate with external systems:
      • Delays can be used to align message processing with external system readiness.
      • E.g.:
        • Your service saves a file in an S3 bucket that is replicated across regions. A newly added object can take some time to become available everywhere, so post-upload processing done by other consumers may need a delay until the object has been replicated to the other regions/AZs.
        • You send a message, but the data it refers to is inserted into the database by another process running in parallel and takes up to two minutes to appear. In this case you set the delay to two minutes (or however long the insert takes), and SQS keeps the message hidden for that long before making it visible in the queue.
    • You can override the queue's default delay for an individual message by passing DelaySeconds to sqs.sendMessage.

    Delivery delay infographic
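A per-message delay could be prepared roughly as follows. DelaySeconds, QueueUrl, and MessageBody are the SQS parameter names, and 900 seconds (15 minutes) is the longest delay SQS accepts. The helper names and the example URL are assumptions for illustration, and the actual sqs.sendMessage call is omitted so the snippet runs without the AWS SDK.

```javascript
const MAX_DELAY_SECONDS = 900; // SQS accepts a delay of at most 15 minutes

// Builds SendMessage parameters with a per-message delay.
function buildDelayedSendParams(queueUrl, body, delaySeconds) {
  if (
    !Number.isInteger(delaySeconds) ||
    delaySeconds < 0 ||
    delaySeconds > MAX_DELAY_SECONDS
  ) {
    throw new RangeError(
      `delaySeconds must be an integer from 0 to ${MAX_DELAY_SECONDS}`
    );
  }
  return { QueueUrl: queueUrl, MessageBody: body, DelaySeconds: delaySeconds };
}

// Earliest time (in epoch milliseconds) at which a consumer could see the message.
function earliestVisibleAt(sentAtMs, delaySeconds) {
  return sentAtMs + delaySeconds * 1000;
}

// Placeholder queue URL; the two-minute delay matches the database example above.
const delayed = buildDelayedSendParams(
  "https://sqs.us-east-1.amazonaws.com/123456789012/post-upload",
  JSON.stringify({ recordId: "abc" }),
  120
);
console.log(delayed.DelaySeconds);
console.log(new Date(earliestVisibleAt(Date.now(), 120)).toISOString());
```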

  3. Receive message wait time:

    • The maximum amount of time a sqs.receiveMessage call waits for a message to arrive when long polling.
    • Min: 0 seconds (short polling).
    • Max: 20 seconds.
  4. Message retention period:

    • How long SQS keeps a message in the queue if it has not been deleted.
    • Min: 60 seconds.
    • Default: 4 days.
    • Max: 14 days.
  5. Maximum message size:

    • Min: 1 byte (1 character).
    • Max: 256 KB.
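The settings above map to queue attributes that are passed as strings when the queue is created. The sketch below validates a few of them before use. The attribute names are the SQS ones. The helper buildQueueAttributes and the LIMITS table are hypothetical. The ranges are taken from the SQS API reference as I understand it (visibility timeout up to 12 hours, delay up to 15 minutes, retention from 60 seconds to 14 days), so double-check them against the current documentation. Maximum message size is deliberately left out.

```javascript
// Allowed ranges, in seconds, for a subset of SQS queue attributes.
const LIMITS = {
  VisibilityTimeout: { min: 0, max: 43200 },          // 0 seconds to 12 hours
  DelaySeconds: { min: 0, max: 900 },                 // 0 seconds to 15 minutes
  ReceiveMessageWaitTimeSeconds: { min: 0, max: 20 }, // 0 means short polling
  MessageRetentionPeriod: { min: 60, max: 1209600 },  // 60 seconds to 14 days
};

// Validates numeric settings and converts them to the string map SQS expects.
function buildQueueAttributes(settings) {
  const attributes = {};
  for (const [name, value] of Object.entries(settings)) {
    if (!Object.prototype.hasOwnProperty.call(LIMITS, name)) {
      throw new Error(`unsupported attribute: ${name}`);
    }
    const { min, max } = LIMITS[name];
    if (!Number.isInteger(value) || value < min || value > max) {
      throw new RangeError(`${name} must be an integer from ${min} to ${max}`);
    }
    attributes[name] = String(value);
  }
  return attributes;
}

console.log(
  buildQueueAttributes({
    VisibilityTimeout: 60,
    DelaySeconds: 0,
    ReceiveMessageWaitTimeSeconds: 20, // enable long polling by default
    MessageRetentionPeriod: 345600,    // 4 days
  })
);
```

The resulting object is meant to be passed as the Attributes field when creating the queue. Failing early on an out-of-range value is cheaper than waiting for the service to reject the request.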

SNS

Kinesis

Kinesis Video Streams

Kinesis video stream infographic

Open protocols

SQS, SNS, and their friends are proprietary AWS services. But we might need something a bit more open and accessible, protocols like:

MQ

SQS VS RabbitMQ

In general, you need to consider these topics when choosing between them:

  1. Are you migrating to the cloud? If yes, you might want to stick with your open protocol so that you do not need to refactor your app, at least for the time being.
  2. Costs.
  3. Speed.
  4. Scalability.
  5. Maintenance.