
Amazon’s Simple Queue Service (SQS) is a great and easy way to establish communication between two systems or applications. The idea is simple: one application publishes a message to the queue, and another application receives and processes it.
Problems can easily arise when we attempt to debug code that sends and/or receives messages from active SQS queues. These problems become compounded when multiple developers on a team need to debug code targeting said queues.
This article demonstrates techniques that can be applied to your code and cloud infrastructure, allowing a team of developers to simultaneously work on and debug code targeting one or more SQS queues while avoiding all the headaches.
The Problem
Assume we have a cloud environment with an SQS queue named input. When a message is written to the input queue, it is processed and then submitted to an external system.
The input queue has an event source mapping to a Lambda function named submit-input. This Lambda function, unsurprisingly, processes the messages it receives by executing any required business logic and then submits them to the external system.

Now, let’s say we want to locally debug the code normally executed by submit-input by polling the input queue for new messages. We would quickly find out that this is, at best, very difficult and, at worst, not possible unless we disable the event source mapping.
The First Consumer Wins
When a client receives an SQS message, that message is considered to be in flight for the duration of the queue’s visibility timeout, preventing any other client from receiving it. Once the message is successfully processed, the consumer deletes it, removing it from the queue.
If the client indicates that an error occurred while processing the message, it is returned to the queue for others to receive. However, an event source mapping (which will usually win the race to receive the message) will likely pick up the returned message and process it again immediately.
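To make the race concrete, here is a minimal sketch of what every competing consumer does with the AWS SDK for .NET: receive a message (hiding it from other consumers while it is in flight) and delete it only after successful processing. The queue URL and timeout value are placeholders, not part of the example architecture.
Receiving and Deleting a Message
using Amazon.SQS;
using Amazon.SQS.Model;

// Minimal sketch of a competing SQS consumer; the queue URL is a placeholder.
var sqsClient = new AmazonSQSClient();
const string queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/input.fifo";

var receiveResponse = await sqsClient.ReceiveMessageAsync(new ReceiveMessageRequest
{
    QueueUrl = queueUrl,
    MaxNumberOfMessages = 1,
    VisibilityTimeout = 30 // The message stays in flight (hidden from other consumers) for 30 seconds.
});

foreach (var message in receiveResponse.Messages)
{
    // Process the message; if processing throws or the visibility timeout elapses,
    // the message returns to the queue and another consumer (likely the event source
    // mapping) will receive it again.
    Console.WriteLine(message.Body);

    await sqsClient.DeleteMessageAsync(new DeleteMessageRequest
    {
        QueueUrl = queueUrl,
        ReceiptHandle = message.ReceiptHandle
    });
}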
This means that we will have a very hard time locally debugging the processing of messages submitted to the input queue. We would need to disable the event source mapping to the submit-input Lambda function, which is a huge pain.
More Developers, More Problems
This problem is aggravated when we add more developers to the mix, even with the event source mapping disabled.
Suppose two developers are attempting to locally debug the input submission code, such that both developers are polling the input queue. In that case, there’s no guarantee that Developer A’s desired input message won’t get intercepted by Developer B and vice versa.
Of course, it would be trivial to simply mock input data and keep everything 100% local; however, it is beneficial to be able to debug the overall live process, especially when we’re dealing with a more complex process, such as one where the big unknown is the output of another system or application.

The problem is simple: multiple developers mean multiple competing consumers, and an SQS queue delivers each message to only one of them. So, good luck working on code that consumes SQS queues if you’re not developing solo.
Luckily, we can support all the use cases discussed above by rerouting messages to sender-specific queues. How do we do this? Well, that’s what we’ll look at next…
Rerouting Messages to Sender-Specific Queues
Continuing with our example architecture, we can modify the event source mapped submit-input Lambda function to reroute messages originating from specific senders to distinct queues, which those senders will poll instead of the core input queue.
We will create a service class so we can incorporate this functionality across any number of Lambda functions. However, before we delve too deeply into that, we need to determine how to identify the message sender.
Identifying the Message Sender/Target
In the case of our first example, we can simply include the sender identification as a message attribute. But how do we identify the sender to target?
Developers using Windows can be uniquely identified and therefore targeted by using their local machine GUID.
We will want messages sent by live application code (and not developers trying to debug something) to be handled by the normal submit-input process. For these messages, the designated target will be the Lambda itself, which will have its own GUID generated by Terraform during deployment.
lambdas.tf
resource "random_uuid" "input" {}
resource "aws_lambda_function" "submit_input" {
function_name = "submit-input"
role = aws_iam_role.lambda.arn
# Insert other Lambda function boilerplate code...
environment {
variables = {
LAMBDA_GUID = random_uuid.input.result
}
}
}
The Lambda function itself can then read this environment variable to confirm whether it is the intended target. Other Lambda functions or code can access this unique ID during normal operation to target the submit-input Lambda function, either via the Parameter Store or by having it shared and added to their own environment variables.
To get a developer’s machine GUID, we can simply query the registry:
Querying for the Machine GUID
using RegistryKey key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\Cryptography");
return key?.GetValue("MachineGuid")?.ToString() ?? Guid.NewGuid().ToString();
A more complete implementation will be in the Implemented Solution section below.
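For completeness, here is a minimal sketch of how a sender (a developer machine or live application code) might attach its ID when submitting to the input queue. The SenderId attribute name matches the service below; the queue URL, body, and group ID are placeholders, and GetMachineGuid refers to the registry lookup shown above.
Sending a Message With a SenderId Attribute
using Amazon.SQS;
using Amazon.SQS.Model;

// Sketch of tagging an outgoing message with the sender's ID.
// The queue URL, body, and message group ID are placeholders.
var sqsClient = new AmazonSQSClient();
string senderId = Environment.GetEnvironmentVariable("LAMBDA_GUID")
                  ?? GetMachineGuid(); // GetMachineGuid is the registry lookup shown above.

await sqsClient.SendMessageAsync(new SendMessageRequest
{
    QueueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/input.fifo",
    MessageBody = "{ \"example\": true }",
    MessageGroupId = "input",
    MessageDeduplicationId = Guid.NewGuid().ToString(),
    MessageAttributes = new Dictionary<string, MessageAttributeValue>
    {
        ["SenderId"] = new MessageAttributeValue { DataType = "String", StringValue = senderId }
    }
});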
Making a Reroutable Queue Service
We may wish to implement queue rerouting functionality across multiple Lambda functions. To that end, we’ll create a general service class we can use in the Lambda functions where we want to add this capability.
To summarize its functionality, it collates all messages with non-Lambda SenderId values and then sends them to a queue whose name is appended with the corresponding SenderId. The clients themselves are responsible for creating these queues, as you’ll see in the next section.
Also, this is designed to work with FIFO queues, so adjust accordingly if you’re just using standard queues.
ReroutableQueueService.cs
/// <summary> /// Provides a service for processing a reroutable queue. /// </summary> public sealed partial class ReroutableQueueService { private readonly ILogger<ReroutableQueueService> _logger; private readonly IAmazonSQS _sqsClient; public ReroutableQueueService(IAmazonSQS sqsClient, ILogger<ReroutableQueueService> logger) { _sqsClient = sqsClient; _logger = logger; } /// <summary> /// Processes queue messages in the provided event payload. /// </summary> /// <param name="sqsEvent">SQS event payload containing messages.</param> /// <param name="processMessage"> /// Processing method executed on eligible, non-rerouted messages. /// </param> /// <returns>Response containing any SQS messages that could not be processed.</returns> public async Task<SQSBatchResponse> ProcessMessagesAsync( SQSEvent sqsEvent, Func<SQSEvent.SQSMessage, Task<bool>> processMessage) { Require.NotNull(sqsEvent); List<SQSBatchResponse.BatchItemFailure> itemFailures = []; Dictionary<string, List<SQSEvent.SQSMessage>> targetQueueMessagesMap = []; _logger.MessagesBeingProcessed(sqsEvent.Records.Count); foreach (var record in sqsEvent.Records) { try { string messageSenderId = record.MessageAttributes["SenderId"].StringValue; if (messageSenderId != Environment.GetEnvironmentVariable("LAMBDA_GUID")) { string sourceQueue = ArnResourceId().Match(record.EventSourceArn).Value; string targetQueue = sourceQueue.Replace(".fifo", $"-{messageSenderId}.fifo", StringComparison.OrdinalIgnoreCase); if (!targetQueueMessagesMap.TryGetValue(targetQueue, out var messages)) { messages = []; targetQueueMessagesMap.Add(targetQueue, messages); } messages.Add(record); continue; } bool success = await processMessage(record); if (!success) { itemFailures.Add(new SQSBatchResponse.BatchItemFailure { ItemIdentifier = record.MessageId }); } } catch (Exception) { _logger.MessageProcessingFailure(record.MessageId); itemFailures.Add(new SQSBatchResponse.BatchItemFailure { ItemIdentifier = record.MessageId }); } } foreach (string targetQueue in targetQueueMessagesMap.Keys) { await RerouteMessagesAsync(targetQueue, targetQueueMessagesMap[targetQueue]); } return new SQSBatchResponse(itemFailures); } private async Task RerouteMessagesAsync(string targetQueue, List<SQSEvent.SQSMessage> messages) { _logger.RereoutingMessages(messages.Count, targetQueue); var getUrlResponse = await _sqsClient.GetQueueUrlAsync(new GetQueueUrlRequest { QueueName = targetQueue }); string queueUrl = getUrlResponse.QueueUrl; var messageEntries = messages .Select(m => new SendMessageBatchRequestEntry { Id = m.MessageId, MessageBody = m.Body, MessageGroupId = ... , MessageDeduplicationId = ... }).ToList(); var response = await _sqsClient.SendMessageBatchAsync(new SendMessageBatchRequest { QueueUrl = queueUrl, Entries = messageEntries }); _logger.ReroutingOutcome(response.Successful.Count, response.Failed.Count); foreach (var failure in response.Failed) { _logger.ReroutingFailure(failure.Id, failure.Message); } } [GeneratedRegex("[^:]+$")] private static partial Regex ArnResourceId(); }
More complex scenarios may require customization of various parts of this, such as the method for resolving the SenderId. For example, while things are straightforward enough for the input queue, we may need alternative ways to resolve the SenderId if we’re dealing with the output queue, which receives messages from an external service (perhaps a session-like property is being used, etc.).
If that’s the case, this can be made into an abstract class, and overrides can be provided in the appropriate places.
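As a purely hypothetical sketch of that shape (the class and member names here are illustrative, not part of the implemented solution), the sender ID lookup could become an overridable member that queue-specific subclasses replace:
Overridable SenderId Resolution
// Hypothetical sketch: the default resolution mirrors the SenderId message attribute used above.
public abstract class ReroutableQueueServiceBase
{
    /// <summary>Resolves the sender ID for the provided message.</summary>
    protected virtual string ResolveSenderId(SQSEvent.SQSMessage message)
        => message.MessageAttributes["SenderId"].StringValue;
}

// An output-queue variant might derive the sender from a session-like property instead.
public sealed class OutputQueueService : ReroutableQueueServiceBase
{
    protected override string ResolveSenderId(SQSEvent.SQSMessage message)
        => message.MessageAttributes["SessionId"].StringValue; // Illustrative attribute name.
}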
Finally, while having a general catch-all exception handler is normally a no-go, we must have one here. If we simply rethrow an unknown exception, every message in the batch will be marked as an error (and then hopefully routed to a DLQ based on your redrive policies). If a message was handled correctly, we want it to disappear into the ether as it normally would.
Let’s get to a fully implemented example solution.
Implemented Solution
Here we’ll examine a working solution that takes the approach described above and puts it into practice. Here’s the list of components we’re dealing with:
- QueuePollingService, a local-only background service used to debug message processing
- SubmitInputFunction, which contains the FunctionHandlerAsync for the event source mapped Lambda function and makes use of our ReroutableQueueService
SQS Polling Service for Developers
The first component is a queue polling service used by developers. Lambda functions that execute due to a message arriving in a queue do so because of an event source mapping (more complicated methods are also possible).
There’s no way to get the local code we’re debugging to execute in this fashion, so we need to respond to the message the old-fashioned way: polling.
The QueuePollingService is an IHostedService (actually a BackgroundService, as we don’t require fine-grained control, and it is essentially a long-running async task) that is only added if the code is being run in a non-Lambda environment (i.e., the LAMBDA_TASK_ROOT environment variable is null). This has the effect of the service only running if we’re debugging the Lambda function code locally, and not after it has been deployed to the cloud.
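A minimal sketch of that conditional registration might look like the following; the generic host setup here is an assumption for illustration and not part of the LambdaFunction boilerplate discussed later.
Registering the Polling Service Locally
using Amazon.SQS;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

builder.Services.AddSingleton<IAmazonSQS, AmazonSQSClient>();

// Only register the polling service when running outside of Lambda,
// i.e., when the LAMBDA_TASK_ROOT environment variable is not set.
if (Environment.GetEnvironmentVariable("LAMBDA_TASK_ROOT") is null)
{
    builder.Services.AddHostedService<QueuePollingService>();
}

await builder.Build().RunAsync();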
QueuePollingService.cs
/// <summary> /// Provides an SQS queue polling service for developers. /// </summary> public sealed class QueuePollingService : BackgroundService { private readonly PeriodicTimer _timer = new(TimeSpan.FromSeconds(5)); private readonly string _senderId = Guid.NewGuid().ToString(); private readonly ILogger<QueuePollingService> _logger; private readonly IAmazonSQS _sqsClient; public QueuePollingService(IAmazonSQS sqsClient, ILogger<QueuePollingService> logger) { _sqsClient = sqsClient; _logger = logger; if (OperatingSystem.IsWindows()) _senderId = GetMachineGuid(); // Support for other OS's can be inserted here. } /// <inheritdoc/> public override void Dispose() { _timer.Dispose(); base.Dispose(); } /// <inheritdoc/> protected override async Task ExecuteAsync(CancellationToken stoppingToken) { _logger.PollingServiceStarting(); string inputUrl = await GetQueueUrl($"input-{_senderId}.fifo", stoppingToken); while (!stoppingToken.IsCancellationRequested && await _timer.WaitForNextTickAsync(stoppingToken)) { var receiveRequest = new ReceiveMessageRequest { QueueUrl = inputUrl, WaitTimeSeconds = 0, // Enable short polling...adjust to your needs. }; var receiveMessageResponse = await _sqsClient.ReceiveMessageAsync(receiveRequest, stoppingToken); foreach (var message in receiveMessageResponse.Messages) { // Insert business logic to test here... // Ideally some common service both this and the SubmitInputFunction would execute. // Delete the message after successful processing. var deleteRequest = new DeleteMessageRequest { QueueUrl = inputUrl, ReceiptHandle = message.ReceiptHandle }; await _sqsClient.DeleteMessageAsync(deleteRequest, stoppingToken); } } } private async Task<string> GetQueueUrl(string queueName, CancellationToken cancellationToken) { try { var getRequest = new GetQueueUrlRequest { QueueName = queueName }; var getResponse = await _sqsClient.GetQueueUrlAsync(getRequest, cancellationToken); return getResponse.QueueUrl; } catch (QueueDoesNotExistException) { // Our developer queue doesn't exist yet, let's create it now. var queueAttributes = new Dictionary<string, string> { [QueueAttributeName.FifoQueue] = "true" }; var createRequest = new CreateQueueRequest { QueueName = queueName, Attributes = queueAttributes }; var createResponse = await _sqsClient.CreateQueueAsync(createRequest, cancellationToken); return createResponse.QueueUrl; } } [SupportedOSPlatform("windows")] private static string GetMachineGuid() { using RegistryKey key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\Cryptography"); return key?.GetValue("MachineGuid")?.ToString() ?? Guid.NewGuid().ToString(); } }
The above service will run on the developers’ machines and will listen for messages in the appropriate queues. Ideally, the GetMachineGuid and other sender ID resolution logic should be in a common service responsible for submitting messages to the input queue. For this article, I’m keeping things simple.
The polling service presented above can be easily expanded and genericized to allow for polling multiple queues if desired.
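As an illustrative sketch only (not part of the implemented solution), the ExecuteAsync loop could iterate over a set of base queue names and poll each developer-specific variant on every tick:
Polling Multiple Queues (Sketch)
// Illustrative sketch: polling several developer-specific queues per tick.
private readonly string[] _baseQueueNames = ["input", "output"];

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested
           && await _timer.WaitForNextTickAsync(stoppingToken))
    {
        foreach (string baseName in _baseQueueNames)
        {
            string queueUrl = await GetQueueUrl($"{baseName}-{_senderId}.fifo", stoppingToken);

            var response = await _sqsClient.ReceiveMessageAsync(
                new ReceiveMessageRequest { QueueUrl = queueUrl, WaitTimeSeconds = 0 }, stoppingToken);

            // Dispatch each received message to the handler registered for this queue...
        }
    }
}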
Submit-Input Function
Here’s the code for our event source mapped Lambda function.
SubmitInputFunction.cs
/// <summary> /// Provides a Lambda function that process input from the input SQS queue. /// </summary> public sealed class SubmitInputFunction : LambdaFunction { public SubmitInputFunction() : base (services => { services.AddTransient<ReroutableQueueService>(); // Any other required services. }) { } /// <summary> /// The function handler for commanding-input. /// </summary> /// <param name="sqsEvent">SQS messages payload from the commanding input queue.</param> /// <param name="context">The Lambda context.</param> /// <returns>Response containing any SQS messages that could not be processed.</returns> public async Task<SQSBatchResponse> FunctionHandlerAsync(SQSEvent sqsEvent, ILambdaContext context) { SQSBatchResponse response = null; await RunService<ReroutableQueueService>( async (s, _) => response = await s.ProcessMessagesAsync(sqsEvent, ProcessMessage)); return response; } private async Task<bool> ProcessMessage(SQSEvent.SQSMessage message) { // Insert business logic here to submit the input to the external service... // Ideally some common service both this and the QueueProcessingService would execute. } }
This class derives from LambdaFunction, a class of mine that I base all my Lambda functions for a particular product on, which provides all the necessary boilerplate, such as setting up DI, logging, etc. I’m not including the code here, as it falls outside the scope of the article and will vary significantly based on your requirements.
The above is a simplified form of a real-world implementation, but it is a tried and tested technique that allows developers on a team to work on particular queues simultaneously and without issue.
Extra Credit: Cleaning Up
One potential downside to this approach is that your cloud environment will soon be awash in developer-specific queues. There is no particular issue with this circumstance; we are not billed based on the number of SQS queues, but rather on their usage.
Still, if we want to maintain a tidy environment, we can create a custom AWS Systems Manager Document that can be used to clean up any queue created by our redirection service. I’ll provide an example of this in a follow-up article.
Thanks for reading, and keep it real out there.