E-commerce runs on payments. Every “Place order” needs a system that moves money from buyer to seller—reliably, without charging twice or losing track of money. This post walks through how to design that: what to build, how money flows (pay-in vs pay-out), which parts do what, and how to handle failures and retries without double-charging.
Payment Gateway & Payment System: How It Fits Together
A payment system is what moves money in a transaction—the rules, services, and tech that make paying and getting paid work. For a site like Amazon, the payment backend does two main things: (1) take money from the customer when they pay (pay-in), and (2) send money to sellers later (pay-out). If we get it wrong, we can lose money, charge people twice, or lose trust. So we focus on doing it correctly and handling failures well, not just on handling huge volume.
Below we look at what to build, how money flows, which parts do what (payment service, executor, PSP, ledger, wallet), and how to handle failures and retries without charging customers twice.
1. What Are We Building? (Scope and Requirements)
“Payment system” can mean different things: a wallet (Apple Pay, Google Pay), a backend that processes payments (like PayPal or Stripe), or the whole checkout flow. So we need to be clear about scope first.
Questions to clarify:
- What are we building? A payment backend for an e-commerce site. When a customer places an order, this system handles all money movement.
- What payment methods? In real life: credit cards, PayPal, bank cards, etc. Here we use credit cards as the main example.
- Do we process cards ourselves? No. We use a third-party payment provider (Stripe, Braintree, Square, etc.—often called a PSP).
- Do we store card numbers? No. For security and compliance (e.g. PCI DSS), the PSP handles card data; we don’t store it.
- How much traffic? Example: 1 million transactions per day.
- Do we pay sellers? Yes. We need to send money from the platform’s account to sellers’ bank accounts (e.g. monthly).
- What if our data and the PSP’s don’t match? We have many internal services (accounting, ledger) and external ones (PSP, banks). When something fails, our records can get out of sync. We need reconciliation—periodically comparing and fixing mismatches.
From that we get a clear list of what the system must do (functional) and how reliable it must be (non-functional).
What the system must do (functional)
- Pay-in: Take money from the customer (e.g. charge the card) and have it land in the platform’s account.
- Pay-out: Send money from the platform’s account to sellers (e.g. to their bank accounts).
How reliable it must be (non-functional)
- Reliability: When a payment fails, we handle it clearly—retry when it makes sense, put repeatedly failing ones in a “failed” queue for people to check, and keep a clear status for each payment.
- Reconciliation: Regularly compare our records (payment system, accounting, ledger) with the PSP’s and banks’, and fix any mismatches.
Rough scale (back-of-the-envelope)
- 1 million transactions per day → about 10 transactions per second (a day has 86,400 seconds; rounding that up to 100,000 gives 1,000,000 ÷ 100,000 = 10).
- 10 per second is not huge for a database. The hard part is getting every payment right and keeping data consistent, not handling massive traffic.
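The estimate rounds a day to 100,000 seconds. As a quick check, the same arithmetic with the exact figure gives a slightly higher number of the same order of magnitude. The variable names below are only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

transactions_per_day = 1_000_000

# Exact average rate versus the rounded shortcut used in the estimate.
exact_tps = transactions_per_day / SECONDS_PER_DAY
rounded_tps = transactions_per_day / 100_000

print(f"exact: {exact_tps:.2f} TPS, rounded: {rounded_tps:.0f} TPS")
```

Either way, the average load stays in the low tens of transactions per second.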
2. Big Picture: Pay-In and Pay-Out
The design has two main flows, matching how money actually moves.
Pay-in vs pay-out (e.g. Amazon)
- Pay-in: Buyer pays → money goes from the buyer’s card to the platform’s bank account. The platform holds the money; the seller’s share is tracked internally (e.g. in a wallet/ledger).
- Pay-out: When conditions are met (e.g. goods delivered), the platform sends the seller’s share from the platform’s account to the seller’s bank account.
So: pay-in = money in; pay-out = money out to sellers.
flowchart LR
subgraph PayIn["Pay-in"]
B1[Buyer Card] -->|Charge| P[Platform Bank Account]
end
subgraph PayOut["Pay-out"]
P -->|Transfer when conditions met| S[Seller Bank Account]
end
P --> W[(Wallet / Ledger)]
W -.Track seller balance.-> S
3. Pay-In Flow: Parts and Steps
The pay-in flow uses several parts that work together.
3.1. Main parts
| Part | What it does |
|---|---|
| Payment service | Receives payment requests, runs risk checks (e.g. anti-fraud, anti–money laundering), and runs the whole flow. Only continues if the payment passes checks. |
| Payment executor | Executes a single payment order via the PSP. One “Place order” click can produce several payment orders (e.g. cart with items from different sellers). |
| Payment Service Provider (PSP) | The company that actually moves the money (e.g. charges the card). They talk to card networks (Visa, Mastercard, etc.) and banks. |
| Card schemes | Visa, Mastercard, Discover, etc.—the networks that run card payments. |
| Ledger | The book of all transactions (who paid whom, how much). Used for reports, revenue, and reconciliation. |
| Wallet | Stores each seller’s balance (and can track how much a user has paid in total). |
3.2. Step-by-step pay-in
- User clicks “Place order” → we create a payment event and send it to the payment service.
- Payment service saves the event and runs risk checks (e.g. anti-fraud).
- One event can include several payment orders (e.g. one checkout with items from several sellers). For each order, the payment service calls the payment executor.
- Payment executor saves the order and asks the PSP to charge the card.
- If that succeeds, the payment service updates the wallet (seller’s balance).
- Then it updates the ledger (double-entry: debit buyer, credit seller).
- When all orders in that checkout are done, we mark the payment event complete. A scheduled job can watch for orders that are stuck and alert the team.
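The steps above can be sketched in a few lines of Python. This is a minimal in-memory sketch, not a production design: `FakePSP`, `PaymentService`, the dictionary-based wallet, and the tuple-based ledger are all hypothetical stand-ins for real services and tables, and the risk check is a placeholder rule.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class PaymentOrder:
    payment_order_id: str
    seller_account: str
    amount: Decimal
    status: str = "NOT_STARTED"


@dataclass
class PaymentEvent:
    checkout_id: str
    buyer_account: str
    orders: list[PaymentOrder]
    is_payment_done: bool = False


class FakePSP:
    """Hypothetical stand-in for a real PSP client; always approves."""

    def charge(self, payment_order_id: str, amount: Decimal) -> bool:
        return True


@dataclass
class PaymentService:
    psp: FakePSP
    wallet: dict[str, Decimal] = field(default_factory=dict)
    ledger: list[tuple[str, str, Decimal]] = field(default_factory=list)

    def passes_risk_check(self, event: PaymentEvent) -> bool:
        # Placeholder rule; a real check would call a fraud/AML service.
        return all(order.amount > 0 for order in event.orders)

    def handle(self, event: PaymentEvent) -> PaymentEvent:
        if not self.passes_risk_check(event):
            return event
        for order in event.orders:
            order.status = "EXECUTING"
            if self.psp.charge(order.payment_order_id, order.amount):
                order.status = "SUCCESS"
                # Wallet first, then ledger, in the same order as the steps.
                previous = self.wallet.get(order.seller_account, Decimal("0"))
                self.wallet[order.seller_account] = previous + order.amount
                self.ledger.append(
                    (event.buyer_account, order.seller_account, order.amount)
                )
            else:
                order.status = "FAILED"
        # The event is complete only when every order succeeded.
        event.is_payment_done = all(o.status == "SUCCESS" for o in event.orders)
        return event


event = PaymentEvent(
    checkout_id="chk-1",
    buyer_account="buyer-1",
    orders=[
        PaymentOrder("po-1", "seller-a", Decimal("10.00")),
        PaymentOrder("po-2", "seller-b", Decimal("5.50")),
    ],
)
service = PaymentService(FakePSP())
service.handle(event)
print(event.is_payment_done, service.wallet)
```

In a real system each step would persist its state before moving on, so that a crash between the PSP call and the wallet update can be detected and repaired.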

4. APIs and Data Stored
4.1. Payment API
POST /v1/payments — Create and run a payment (one “place order”).
| Field | Description | Type |
|---|---|---|
| buyer_info | Buyer information | json |
| checkout_id | Globally unique checkout ID | string |
| credit_card_info | Encrypted card data or PSP token | json |
| payment_orders | List of payment orders | list |
Each payment_order:
| Field | Description | Type |
|---|---|---|
| seller_account | Who receives the money | string |
| amount | Transaction amount | string |
| currency | ISO 4217 currency | string |
| payment_order_id | Unique ID for this payment; also used so we never charge twice (idempotency key) | string |
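Why is amount a string? Different systems and serialization formats handle numeric precision differently, and binary floating-point numbers cannot represent most decimal fractions exactly, so passing amounts around as numbers risks rounding errors. A string also lets us support very large or very small amounts. We store and send the amount as text and parse it into an exact decimal type only when we display or calculate.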
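A short illustration of the rounding problem, using Python's standard `decimal` module. The `parse_amount` helper is a hypothetical name, and rejecting negative amounts is an assumption made for this sketch.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)  # False


def parse_amount(amount: str) -> Decimal:
    """Parse an API amount string into an exact decimal value."""
    value = Decimal(amount)
    if not value.is_finite() or value < 0:
        raise ValueError(f"invalid amount: {amount!r}")
    return value


# Parsing the strings directly keeps every digit exact.
decimal_sum = parse_amount("0.10") + parse_amount("0.20")
print(decimal_sum == Decimal("0.30"))  # True
print(str(decimal_sum))  # 0.30
```

Converting back with `str()` preserves the trailing zero, so the value can be stored or sent on as text without loss.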
GET /v1/payments/{:id} — Get the status of a payment order.
4.2. What we store (payment service)
Two main tables: payment event (one per “place order”) and payment order (one per seller/order in that checkout). For payments we care more about stability and clear audit trail than raw speed, so a normal relational database with transactions (ACID) is a good fit.
Payment event
| Name | Type |
|---|---|
| checkout_id | string (PK) |
| buyer_info | string |
| seller_info | string |
| credit_card_info | (PSP-dependent) |
| is_payment_done | boolean |
Payment order
| Name | Type |
|---|---|
| payment_order_id | string (PK) |
| buyer_account | string |
| amount | string |
| currency | string |
| checkout_id | string (FK) |
| payment_order_status | enum: NOT_STARTED, EXECUTING, SUCCESS, FAILED |
| wallet_updated | boolean |
| ledger_updated | boolean |
Status goes: NOT_STARTED → EXECUTING → SUCCESS or FAILED. When we get SUCCESS, we update the wallet and then the ledger. When every order in that checkout is successful, we set is_payment_done to true.
stateDiagram-v2
[*] --> NOT_STARTED
NOT_STARTED --> EXECUTING: Payment executor starts
EXECUTING --> SUCCESS: PSP confirms
EXECUTING --> FAILED: PSP rejects / timeout
SUCCESS --> [*]: Wallet + Ledger updated
FAILED --> [*]: Alert / retry or DLQ
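One way to enforce these transitions in code is an explicit table of allowed moves. This is a sketch under assumptions: the `transition` and `is_payment_done` helpers are hypothetical names, and FAILED is treated as terminal here, with any retry handled by separate logic.

```python
from enum import Enum


class PaymentOrderStatus(Enum):
    NOT_STARTED = "NOT_STARTED"
    EXECUTING = "EXECUTING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


# Each status maps to the set of statuses it may move to next.
ALLOWED_TRANSITIONS = {
    PaymentOrderStatus.NOT_STARTED: {PaymentOrderStatus.EXECUTING},
    PaymentOrderStatus.EXECUTING: {
        PaymentOrderStatus.SUCCESS,
        PaymentOrderStatus.FAILED,
    },
    PaymentOrderStatus.SUCCESS: set(),
    PaymentOrderStatus.FAILED: set(),
}


def transition(current: PaymentOrderStatus, new: PaymentOrderStatus) -> PaymentOrderStatus:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def is_payment_done(order_statuses) -> bool:
    """A checkout is done only when it has orders and all of them succeeded."""
    statuses = list(order_statuses)
    return bool(statuses) and all(
        status is PaymentOrderStatus.SUCCESS for status in statuses
    )
```

Rejecting illegal moves (for example SUCCESS back to EXECUTING) guards against a late or duplicated message reopening a finished payment.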
5. Double-Entry Ledger and Hosted Payment Page
5.1. Double-entry ledger (two sides to every transaction)
Every transaction is written twice with the same amount: one side is a debit, the other a credit. Example: user pays a seller $1:
| Account | Debit | Credit |
|---|---|---|
| buyer | $1 | |
| seller | | $1 |
The total of all entries is always zero (every debit has a matching credit). That way we can trace every cent and keep the books consistent. In code we often use a ledger where we only ever add new rows and never change or delete old ones.
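A minimal sketch of an append-only double-entry ledger, assuming an in-memory list in place of a database table. `LedgerEntry`, `Ledger`, and `record_transfer` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)  # frozen: an entry cannot be modified after creation
class LedgerEntry:
    transaction_id: str
    account: str
    debit: Decimal = Decimal("0")
    credit: Decimal = Decimal("0")


class Ledger:
    def __init__(self) -> None:
        self._entries: list[LedgerEntry] = []

    def record_transfer(
        self, transaction_id: str, debit_account: str,
        credit_account: str, amount: Decimal,
    ) -> None:
        """Append one debit row and one credit row for the same amount."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._entries.append(
            LedgerEntry(transaction_id, debit_account, debit=amount)
        )
        self._entries.append(
            LedgerEntry(transaction_id, credit_account, credit=amount)
        )

    def entries(self) -> tuple[LedgerEntry, ...]:
        # Expose a read-only copy; rows are only ever appended.
        return tuple(self._entries)

    def net_total(self) -> Decimal:
        """Debits minus credits over all rows; zero when the books balance."""
        return sum((e.debit - e.credit for e in self._entries), Decimal("0"))


ledger = Ledger()
ledger.record_transfer("tx-1", "buyer", "seller", Decimal("1.00"))
print(ledger.net_total())
```

A correction in this model is a new pair of rows that reverses the earlier one, never an update to an existing row.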
5.2. Hosted payment page (security and PCI)
Most companies do not store real card numbers (PCI DSS rules are strict). Instead they use a hosted payment page from the PSP: the customer types their card details on the PSP’s page (or in the PSP’s app/SDK). Our system never sees the card number. The PSP gives us a token or confirmation; we only store that token and use it for later charges or for reconciliation.
6. Pay-Out Flow
Pay-out runs in the opposite direction from pay-in:
- Pay-in: PSP moves money from buyer’s card → platform’s bank account.
- Pay-out: A pay-out provider (e.g. Tipalti or similar) moves money from the platform’s bank account → each seller’s bank account. This usually runs when conditions are met (e.g. goods delivered, or a monthly batch).
We don’t do pay-out ourselves because it comes with a lot of rules and paperwork. Pay-out providers handle things like: tax (withholding, reporting, different rules per country), compliance (anti-money laundering, know-your-customer for sellers), reporting (statements, audit trails), and multi-currency / international transfers (exchange rates, local payment methods, bank details in many countries). Using a dedicated provider keeps the platform out of that complexity and regulatory risk.
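On our side, a batch pay-out mostly means turning wallet balances into instructions for the provider. The sketch below is an illustration only: `build_payout_batch`, the `payout_id` format, and the minimum threshold are assumptions, not any provider's actual API.

```python
from decimal import Decimal


def build_payout_batch(
    wallet_balances: dict[str, Decimal],
    period: str,
    minimum: Decimal = Decimal("0.01"),
) -> list[dict[str, str]]:
    """Build one payout instruction per seller with a payable balance."""
    batch = []
    for seller_account, balance in sorted(wallet_balances.items()):
        if balance < minimum:
            continue  # nothing to pay out for this seller
        batch.append({
            # Period plus seller gives a stable ID, so a re-run of the
            # same batch can be deduplicated downstream.
            "payout_id": f"{period}:{seller_account}",
            "seller_account": seller_account,
            "amount": str(balance),
        })
    return batch


batch = build_payout_batch(
    {"seller-b": Decimal("0"), "seller-a": Decimal("120.50")},
    period="2024-01",
)
print(batch)
```

The stable `payout_id` plays the same role for pay-out that the idempotency key plays for pay-in.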
7. Making It Reliable: Failures, Retries, No Double Charge
In real systems, things fail and users sometimes click “Pay” twice. The design has to handle: duplicate clicks, timeouts, retries, and keeping our data in sync with the PSP’s.
7.1. Hosted payment page: step-by-step
With a hosted payment page:
- Our app calls the payment service with order info (no card data).
- Payment service registers the payment with the PSP: amount, currency, where to redirect the user after payment, and a unique ID (often the payment_order_id) so we only register once.
- PSP returns a token; we save it.
- We show the user the PSP’s payment page (or use the PSP’s SDK), with that token and the redirect URL. The user enters their card on the PSP’s page; the PSP charges the card.
- When done, the PSP sends the user back to our site (e.g. with success/fail in the URL).
- In the background, the PSP calls our webhook (a URL we gave them) with the final status. We update the payment status in our database from that webhook.
sequenceDiagram
participant C as Client
participant PS as Payment Service
participant PSP as PSP Hosted Page
participant U as User
C->>PS: Checkout (order info, no card)
PS->>PS: Register with PSP (nonce = payment_order_id)
PS->>PSP: Create payment session
PSP-->>PS: Token
PS-->>C: Token + redirect URL
C->>U: Show PSP-hosted page (token, redirect URL)
U->>PSP: Enter card details, Pay
PSP->>PSP: Process payment
PSP-->>U: Redirect to merchant URL (result in query)
PSP->>PS: Webhook: final status
PS->>PS: Update payment_order_status
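The webhook step deserves care, because a PSP may deliver the same notification more than once. A minimal sketch of a handler that is safe to call repeatedly follows; the payload field names (`payment_order_id`, `status`) and the dictionary used as storage are assumptions for illustration, since real PSP payloads differ.

```python
FINAL_STATUSES = {"SUCCESS", "FAILED"}


def handle_psp_webhook(payment_orders: dict[str, str], payload: dict) -> str:
    """Apply a PSP notification to our stored status; repeat calls are harmless."""
    order_id = payload["payment_order_id"]
    new_status = payload["status"]

    if order_id not in payment_orders:
        return "unknown_order"
    if new_status not in FINAL_STATUSES:
        return "ignored"  # e.g. an intermediate status we do not store here

    if payment_orders[order_id] in FINAL_STATUSES:
        # Duplicate or late delivery: keep the first final status.
        return "already_final"

    payment_orders[order_id] = new_status
    return "updated"


orders = {"po-1": "EXECUTING"}
first = handle_psp_webhook(orders, {"payment_order_id": "po-1", "status": "SUCCESS"})
second = handle_psp_webhook(orders, {"payment_order_id": "po-1", "status": "SUCCESS"})
print(first, second, orders)
```

A production handler would also authenticate the request before trusting it, which this sketch leaves out.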
Because networks and systems can fail, we still need reconciliation to make sure our records match the PSP’s.
7.2. Reconciliation
Reconciliation means we regularly compare our records (ledger, wallet) with the PSP’s and the bank’s. Banks and PSPs often send a settlement file every day (list of transactions and balances). We load that and compare it to our ledger. When something doesn’t match, we either fix it automatically, send it to the finance team to fix, or escalate for investigation. This is how we catch and fix mismatches.
flowchart LR
subgraph Internal["Our systems"]
L[Ledger]
W[Wallet]
end
subgraph External["External"]
SF[Settlement file from PSP/Bank]
end
SF --> R[Reconciliation Job]
L --> R
W --> R
R --> C{Match?}
C -->|Yes| OK[OK]
C -->|No| M[Mismatch]
M --> A[Auto-correct]
M --> Q[Queue for finance]
M --> E[Escalate]
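The comparison at the heart of the reconciliation job can be sketched as follows. It assumes both sides have already been loaded into dictionaries keyed by payment_order_id; real settlement files need parsing first, and the mismatch labels are hypothetical.

```python
from decimal import Decimal


def reconcile(
    ledger: dict[str, Decimal], settlement: dict[str, Decimal]
) -> list[tuple[str, str]]:
    """Return (payment_order_id, reason) for every record that disagrees."""
    mismatches = []
    for order_id in sorted(ledger.keys() | settlement.keys()):
        ours = ledger.get(order_id)
        theirs = settlement.get(order_id)
        if ours is None:
            mismatches.append((order_id, "missing_in_ledger"))
        elif theirs is None:
            mismatches.append((order_id, "missing_in_settlement"))
        elif ours != theirs:
            mismatches.append((order_id, "amount_mismatch"))
    return mismatches


our_ledger = {
    "po-1": Decimal("10.00"),
    "po-2": Decimal("5.00"),
    "po-3": Decimal("7.00"),
}
psp_settlement = {
    "po-1": Decimal("10.00"),
    "po-2": Decimal("5.50"),
    "po-4": Decimal("1.00"),
}
print(reconcile(our_ledger, psp_settlement))
```

Each reason can then be routed to one of the three outcomes in the diagram: auto-correct, queue for finance, or escalate.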
7.3. When payments take a long time (e.g. 3D Secure, manual review)
Some payments don’t finish in a few seconds (e.g. extra security step like 3D Secure, or the PSP flags it for human review). The PSP gives us a “pending” status; we show that to the user and can offer a “check status” page. When the result is ready, the PSP tells us via webhook (or we poll them). So we have to support payments that stay “pending” for a while.
7.4. Calling other services: wait for reply vs fire-and-forget
- Wait for reply (synchronous, e.g. HTTP): Simple, but if one service is slow or down, the whole request waits or fails.
- Fire-and-forget (asynchronous, e.g. message queue like Kafka): We send a message and other services process it when they can. One payment can trigger several things (run payment, update analytics, send notification, etc.). For a payment system with many dependencies, this approach often scales better and fails more gracefully, because a slow consumer does not block the sender.
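The asynchronous style can be illustrated with Python's standard `queue` and `threading` modules. This is an in-process stand-in for a real message broker such as Kafka, not a Kafka client; the event shape and the `publish` helper are assumptions.

```python
import queue
import threading

payment_events: queue.Queue = queue.Queue()
processed: list[str] = []


def worker() -> None:
    """Consume events at the worker's own pace until a stop signal arrives."""
    while True:
        event = payment_events.get()
        if event is None:  # sentinel value: stop the worker
            payment_events.task_done()
            break
        processed.append(event["payment_order_id"])
        payment_events.task_done()


def publish(event: dict) -> None:
    # Returns immediately; the caller does not wait for processing.
    payment_events.put(event)


thread = threading.Thread(target=worker, daemon=True)
thread.start()

publish({"payment_order_id": "po-1"})
publish({"payment_order_id": "po-2"})
payment_events.put(None)

# Block only here, for the demo, until everything queued has been handled.
payment_events.join()
print(processed)
```

The sender's only dependency is the queue itself, which is what lets downstream consumers be slow or briefly unavailable without failing the request.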
7.5. When a payment fails: retry and “failed” queue
- Clear status at every step: We save the payment status at each stage. Then we always know whether to retry or refund.
- Retry queue: If the failure looks temporary (e.g. network blip), we put the request in a retry queue. Workers try again later, with increasing delay between attempts.
- Dead letter queue (DLQ): If we’ve retried too many times, we move the request to a failed queue (DLQ) for people to look at. That way we don’t retry forever and we can isolate bad or stuck requests.
flowchart LR
P[Payment request] --> F{Fail?}
F -->|No| S[Success]
F -->|Yes| T{Retryable?}
T -->|No| DB[(Store error)]
T -->|Yes| RQ[Retry queue]
RQ --> W[Worker retry]
W --> F
W -->|Retry count > N| DLQ[Dead letter queue]
DLQ --> INV[Investigation]
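The retry-then-DLQ path can be sketched as a loop with exponential backoff. This is a simplified, synchronous sketch: `TransientError`, `process_with_retry`, and the list used as a dead letter queue are hypothetical, and a real system would schedule retries through a queue instead of sleeping in-process.

```python
import time


class TransientError(Exception):
    """A failure that is worth retrying, e.g. a network timeout."""


def process_with_retry(
    request, handler, dead_letter_queue,
    max_attempts=3, base_delay=0.5, sleep=time.sleep,
):
    """Call handler(request), backing off between attempts.

    Non-retryable exceptions propagate to the caller unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(request)
        except TransientError:
            if attempt == max_attempts:
                break
            # Delay doubles each time: base, 2x base, 4x base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    # Retries exhausted: park the request for manual investigation.
    dead_letter_queue.append(request)
    return None
```

Passing `sleep` as a parameter is a convenience for testing, so the backoff schedule can be observed without actually waiting.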
7.6. No double charges: “exactly once”
We want each payment to be charged exactly once—not zero times (lost) and not twice (double charge). We get there with two ideas:
- Don’t miss a charge: Use retries (with backoff) so we eventually process the payment if something failed temporarily.
- Don’t charge twice: Use idempotency: the same request always leads to the same result. If the user (or app) sends the same request again, we don’t charge again—we just return the result we already have.
How idempotency works: The client sends a unique idempotency key with each payment (e.g. checkout_id or payment_order_id). If we’ve already seen that key, we don’t charge again; we return the stored result. We also send that same key to the PSP so they can deduplicate on their side. In the database, we use a unique constraint on that key so we can only have one row per payment.
So: retries give us at-least-once processing (we don’t lose a payment); idempotency gives us at-most-once (we don’t charge twice). Together they add up to exactly once from the customer’s point of view.
flowchart LR
R[Request + Idempotency-Key] --> C{Key seen before?}
C -->|Yes| Return[Return cached result]
C -->|No| Insert[Insert row with key]
Insert --> Process[Process payment]
Process --> Store[Store result]
Store --> Return
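The flow in the diagram can be sketched with Python's built-in `sqlite3` module, using a primary key as the unique constraint on the idempotency key. The table layout, the `pay` function, and the `charge_psp` stub are assumptions for illustration; the stub stands in for a real PSP call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    " idempotency_key TEXT PRIMARY KEY,"
    " amount TEXT NOT NULL,"
    " result TEXT)"
)

charge_calls: list[str] = []


def charge_psp(idempotency_key: str, amount: str) -> str:
    """Hypothetical PSP stub; records each call so duplicates are visible."""
    charge_calls.append(idempotency_key)
    return "SUCCESS"


def pay(idempotency_key: str, amount: str):
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO payments (idempotency_key, amount) VALUES (?, ?)",
                (idempotency_key, amount),
            )
    except sqlite3.IntegrityError:
        # Key seen before: return the stored result without charging again.
        # The result is None while the first attempt is still in flight.
        row = conn.execute(
            "SELECT result FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        return row[0]

    result = charge_psp(idempotency_key, amount)
    with conn:
        conn.execute(
            "UPDATE payments SET result = ? WHERE idempotency_key = ?",
            (result, idempotency_key),
        )
    return result


print(pay("po-1", "10.00"))
print(pay("po-1", "10.00"))
print(charge_calls)
```

Relying on the insert to fail, instead of checking first and inserting second, avoids the race where two concurrent requests both see "not seen before".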
7.7. Keeping our data in sync
- Inside our system: Use idempotency + retries and a clear status for each order so we process each payment exactly once.
- With the PSP: Use the same idempotency key when we retry; plus reconciliation because the PSP’s data can be delayed or wrong sometimes.
- If we use database replicas: We either read and write only from the primary, or use a database that keeps replicas in sync, so we don’t make decisions on outdated data.
7.8. Security in short
| Risk | What we do |
|---|---|
| Someone reads or changes data in transit | HTTPS, encryption, integrity checks |
| Someone intercepts between client and server | TLS, certificate pinning |
| Data loss | Replication, backups, snapshots |
| Too many requests (DDoS) | Rate limiting, firewall |
| Storing card numbers / PCI | Don’t store real card numbers; use tokens and hosted payment page (or a compliant vault) |
| Fraud | Address checks (AVS), CVV, risk scoring, behavior analysis |
8. Wrap-Up
A payment gateway for e-commerce is built around pay-in (money from customer to platform) and pay-out (money from platform to sellers). The main ideas:
- Be clear on scope: Who charges, who gets paid, who holds the money in between.
- Use a PSP and hosted payment page: Let the PSP handle card data and PCI; we use tokens and webhooks.
- Double-entry ledger and wallet to keep internal books and seller balances right.
- Reconciliation with the PSP and banks to find and fix mismatches.
- No double charges: idempotency keys (e.g. payment_order_id) plus retries and clear payment status.
- Handle failures well: Retry queue, failed queue (DLQ), clear status at each step, and support for payments that stay “pending” for a while.
Traffic (e.g. 10 payments per second) is not the main challenge; the hard part is getting every payment right and keeping data consistent. This post summarized how to think about designing a reliable payment gateway and backend.