Our real-time data pipeline was failing due to mysterious throughput limits. The culprit? A little-known Kinesis Data Firehose limitation that has nothing to do with shards or provisioned throughput. In this scenario, we expose the undocumented partition key limit that nobody talks about.
Introduction to Kinesis Data Firehose
Kinesis Data Firehose is a fully managed service that captures, transforms, and loads data into Amazon S3, Amazon Redshift, Amazon Elasticsearch, and Splunk. It's commonly used for real-time data processing and analytics.
import { PutRecordCommand } from '@aws-sdk/client-kinesis';
import { KinesisClient } from '@aws-sdk/client-kinesis';
const kinesisClient = new KinesisClient({ region: 'us-west-2' });
const command = new PutRecordCommand({
StreamName: 'my-stream',
Records: [
{
Data: Buffer.from('Hello World'),
PartitionKey: 'pk-1',
},
],
});
kinesisClient.send(command).then((data) => {
console.log(data);
}).catch((err) => {
console.error(err);
});
Be aware that Kinesis shard limits (1MB/s write, 2MB/s read) can catch teams off-guard at scale. For example, the following error may occur:
ProvisionedThroughputExceededException: Rate exceeded for shard id ... in stream ... under account ...
The Mysterious Throughput Limit
We were experiencing a throughput limit in our real-time analytics pipeline, but it wasn't related to the Kinesis shard limits. The error message was InternalFailure: Internal server error, which wasn't very informative.
import { PutRecordCommand } from '@aws-sdk/client-kinesis';
import { KinesisClient } from '@aws-sdk/client-kinesis';
const kinesisClient = new KinesisClient({ region: 'us-west-2' });
const command = new PutRecordCommand({
StreamName: 'my-stream',
Records: [
{
Data: Buffer.from('Hello World'),
PartitionKey: 'pk-1',
},
],
});
kinesisClient.send(command).then((data) => {
console.log(data);
}).catch((err) => {
console.error(err);
if (err.name === 'InternalFailure') {
console.log('Internal server error occurred');
}
});
When dealing with Kinesis errors, remember that GetRecords returns empty even when data exists due to eventual propagation. This can lead to confusing behavior if not properly handled.
Partition Key Limits: The Hidden Culprit
After further investigation, we discovered that Kinesis Data Firehose has a fixed limit of 1000 active partition keys. This limit can cause throughput limits and record failures if not properly handled.
import { PutRecordCommand } from '@aws-sdk/client-kinesis';
import { KinesisClient } from '@aws-sdk/client-kinesis';
const kinesisClient = new KinesisClient({ region: 'us-west-2' });
const command = new PutRecordCommand({
StreamName: 'my-stream',
Records: [
{
Data: Buffer.from('Hello World'),
PartitionKey: 'pk-1001', // exceeds the limit
},
],
});
kinesisClient.send(command).then((data) => {
console.log(data);
}).catch((err) => {
console.error(err);
if (err.name === 'KinesisException') {
console.log('Partition key limit exceeded');
}
});
Be aware that shard iterator expiration after 5 minutes can cause consumer failures. This can be mitigated by periodically renewing the iterator.
Redesigning the Pipeline
To avoid the partition key limit, we redesigned our pipeline to use a hashing function that distributes the partition keys across multiple streams.
import { PutRecordCommand } from '@aws-sdk/client-kinesis';
import { KinesisClient } from '@aws-sdk/client-kinesis';
import * as crypto from 'crypto';
const kinesisClient = new KinesisClient({ region: 'us-west-2' });
const hashingFunction = (data: string) => {
const hash = crypto.createHash('sha256');
hash.update(data);
return hash.digest('hex').slice(0, 10);
};
const command = new PutRecordCommand({
StreamName: 'my-stream',
Records: [
{
Data: Buffer.from('Hello World'),
PartitionKey: hashingFunction('Hello World'),
},
],
});
kinesisClient.send(command).then((data) => {
console.log(data);
}).catch((err) => {
console.error(err);
});
When redesigning the pipeline, remember that Kinesis doesn't support message filtering like SQS/EventBridge. This can lead to increased processing costs if not properly handled.
Best Practices for Avoiding the Partition Key Trap
To avoid the partition key limit, follow these best practices:
- Use a hashing function to distribute the partition keys across multiple streams.
- Monitor the partition key count and adjust the hashing function accordingly.
- Use a combination of Kinesis streams and Lambda functions to process and transform the data.
import { PutRecordCommand } from '@aws-sdk/client-kinesis';
import { KinesisClient } from '@aws-sdk/client-kinesis';
import * as crypto from 'crypto';
const kinesisClient = new KinesisClient({ region: 'us-west-2' });
const hashingFunction = (data: string) => {
const hash = crypto.createHash('sha256');
hash.update(data);
return hash.digest('hex').slice(0, 10);
};
const command = new PutRecordCommand({
StreamName: 'my-stream',
Records: [
{
Data: Buffer.from('Hello World'),
PartitionKey: hashingFunction('Hello World'),
},
],
});
kinesisClient.send(command).then((data) => {
console.log(data);
}).catch((err) => {
console.error(err);
});
Be aware that require(esm) in Node 22 breaks existing Lambda layers silently. This can lead to unexpected behavior if not properly handled.
The Takeaway
Here are some key takeaways from our experience with Kinesis Data Firehose:
- Kinesis Data Firehose has a fixed limit of 1000 active partition keys, which can cause throughput limits and record failures if not properly handled.
- Use a hashing function to distribute the partition keys across multiple streams to avoid the limit.
- Monitor the partition key count and adjust the hashing function accordingly.
- Be aware of Kinesis shard limits (1MB/s write, 2MB/s read) and shard iterator expiration after 5 minutes.
- Use a combination of Kinesis streams and Lambda functions to process and transform the data, but be aware of the limitations of Lambda@Edge and provisioned concurrency costs.
Transparency notice
AI-crafted with Groq, powered by LLaMA 3.3 70B.
The topic was scouted from live AWS and Node.js ecosystem signals, and the content โ
including all code examples โ was written autonomously without human editing.Published: 2026-06-17 ยท Primary focus: Kinesis
All code blocks are intended to be correct and runnable, but please verify them
against the official AWS SDK v3 docs
before using in production.Find an error? Drop a comment โ corrections are always welcome.
Top comments (0)