Cracking Android SDE2/SDE3 Interviews in 2026: Deep Dives, Code, Follow-ups | Part-7

     

71. Design a Production-Ready Stopwatch Using MVVM (No Libraries)

Architectural Approach

For a production stopwatch, correctness and lifecycle safety matter more than UI frequency. The key principles are:

1. Time source must be monotonic

Use SystemClock.elapsedRealtime() — not System.currentTimeMillis() — to avoid issues with:

  • device clock changes
  • NTP sync
  • timezone updates

2. State must survive configuration & process death

  • UI state → StateFlow
  • Persist critical timing data → SavedStateHandle

3. Ticker should be lifecycle-aware

  • Use viewModelScope
  • Cancel explicitly on pause/reset
  • Avoid leaking coroutines

4. UI refresh ≠ time precision

  • Time precision is millisecond-accurate
  • UI updates can be throttled (e.g., 16ms for ~60fps)

Data Model (Immutable, UI-friendly)

@Immutable
data class StopwatchState(
    val elapsedMs: Long = 0L,
    val isRunning: Boolean = false
)

Immutable state ensures:

  • predictable recomposition
  • easy testing
  • no race conditions in UI

ViewModel (Core Logic)

@HiltViewModel
class StopwatchViewModel @Inject constructor(
    private val savedStateHandle: SavedStateHandle
) : ViewModel() {

    private val _state = MutableStateFlow(StopwatchState())
    val state: StateFlow<StopwatchState> = _state.asStateFlow()

    private var tickerJob: Job? = null

    // Time already elapsed before the last start
    private var accumulatedTime: Long =
        savedStateHandle["accumulated_time"] ?: 0L

    // Timestamp when the current run started
    private var startTime: Long = 0L

    fun start() {
        if (_state.value.isRunning) return
        startTime = SystemClock.elapsedRealtime()
        // Mark as running immediately so pause() works before the first tick
        _state.value = _state.value.copy(isRunning = true)
        tickerJob = viewModelScope.launch {
            while (isActive) {
                delay(16L) // ~60fps UI refresh
                val elapsed =
                    SystemClock.elapsedRealtime() - startTime + accumulatedTime
                _state.value = StopwatchState(
                    elapsedMs = elapsed,
                    isRunning = true
                )
            }
        }
    }

    fun pause() {
        if (!_state.value.isRunning) return
        tickerJob?.cancel()
        tickerJob = null
        accumulatedTime += SystemClock.elapsedRealtime() - startTime
        savedStateHandle["accumulated_time"] = accumulatedTime
        _state.value = _state.value.copy(isRunning = false)
    }

    fun reset() {
        tickerJob?.cancel()
        tickerJob = null
        accumulatedTime = 0L
        savedStateHandle["accumulated_time"] = 0L
        _state.value = StopwatchState()
    }

    override fun onCleared() {
        super.onCleared()
        tickerJob?.cancel()
    }
}

Why This Works in Production

✅ Accuracy

  • Uses elapsedRealtime() → immune to clock changes
  • Millisecond precision independent of UI refresh rate

✅ Lifecycle Safety

  • SavedStateHandle survives:
      • rotation
      • background kill
      • low-memory process death

✅ Performance

  • Single coroutine
  • No busy loops
  • UI updates throttled to 16ms instead of every millisecond

✅ Testability

  • No Android UI dependencies
  • Fully testable with virtual time

Unit Test (Deterministic & Fast)

@Test
fun stopwatch_emits_correct_elapsed_time() = runTest {
    // Assumes a TestDispatcher is installed as Dispatchers.Main (main-dispatcher rule)
    // and that the ViewModel reads time from an injectable clock rather than
    // calling SystemClock.elapsedRealtime() directly (see the sketch below)
    val viewModel = StopwatchViewModel(SavedStateHandle())
    viewModel.start()

    advanceTimeBy(1_000)

    val elapsed = viewModel.state.value.elapsedMs
    assertEquals(1_000L, elapsed)
}

This works because:

  • delay() respects virtual time inside runTest
  • SystemClock.elapsedRealtime() is not controlled by the test scheduler, so the time source must be injected (see the sketch below) for the assertion to be exact
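
A minimal sketch of how the clock could be injected to make this test fully deterministic. The TimeSource interface and FakeTimeSource are illustrative, not part of the code above:

// Hypothetical abstraction over SystemClock for testability
interface TimeSource {
    fun elapsedRealtime(): Long
}

class AndroidTimeSource : TimeSource {
    override fun elapsedRealtime(): Long = SystemClock.elapsedRealtime()
}

// Test double advanced manually from the test
class FakeTimeSource(private var now: Long = 0L) : TimeSource {
    override fun elapsedRealtime(): Long = now
    fun advanceBy(ms: Long) { now += ms }
}

The ViewModel would take a TimeSource as a constructor parameter and call timeSource.elapsedRealtime() wherever it currently calls SystemClock directly.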

Production Extensions (What I’d Mention as an Architect)

  • Background stopwatch
      • Promote the logic to a Foreground Service
      • Expose time via StateFlow or a binder
  • Battery optimization
      • Drop UI updates to ~250ms when the app is backgrounded (see the sketch below)
  • Multi-stopwatch support
      • Store state per stopwatch ID
  • Persistence across reboot
      • Persist accumulated time to a DB plus a reboot timestamp
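
A minimal sketch of the backgrounded-throttling idea. The class and its foreground/background entry points are illustrative; how they are wired to lifecycle callbacks is up to the app:

// Illustrative only: the ticker reads a mutable interval on each iteration,
// so elapsed-time accuracy is unaffected — only the UI refresh rate changes.
class TickIntervalPolicy {
    @Volatile
    var tickIntervalMs: Long = 16L   // ~60fps while visible
        private set

    fun onAppForegrounded() { tickIntervalMs = 16L }
    fun onAppBackgrounded() { tickIntervalMs = 250L } // coarser updates in background
}

The ViewModel's ticker loop would call delay(policy.tickIntervalMs) instead of a hard-coded delay(16L); because elapsed time is always recomputed from elapsedRealtime(), a slower tick never loses accuracy.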

Interview One-Line Summary

I model the stopwatch as a monotonic time accumulator using elapsedRealtime, expose immutable state via StateFlow, persist only essential timing data in SavedStateHandle, and control execution with a lifecycle-aware coroutine ticker. The design is accurate, testable, and production-safe.

72. Design Search with Debounce, Pagination, and Cache

Problem Framing (What the Interviewer Is Really Asking)

A production search system must:

  1. Avoid over-fetching (debounce + cancellation)
  2. Scale with large result sets (pagination)
  3. Be fast and offline-capable (local cache)
  4. Handle rapid query changes safely (Flow + flatMapLatest)
  5. Minimize network usage (cache-first strategy)

High-Level Architecture

MVVM + Repository + Paging 3 + Room (FTS)

UI
 └── ViewModel (debounce, query state)
      └── Repository (cache-first orchestration)
           ├── Room FTS (instant, offline)
           └── Remote API (pagination + refresh)

Key design choice:

  • Room is the single source of truth
  • Network updates the cache
  • Paging reads only from DB

This is the single-source-of-truth pattern recommended in Android's official offline-first guidance and widely used in large-scale apps.

ViewModel: Debounced, Cancelable Search

@OptIn(ExperimentalCoroutinesApi::class, FlowPreview::class)
class SearchViewModel(
    private val repository: SearchRepository
) : ViewModel() {

    private val searchQuery = MutableStateFlow("")

    val searchResults: Flow<PagingData<SearchResult>> =
        searchQuery
            .map { it.trim() }
            .filter { it.length > 2 }   // avoid noisy queries
            .debounce(300)              // UX sweet spot
            .distinctUntilChanged()
            .flatMapLatest { query ->
                repository.search(query)
            }
            .cachedIn(viewModelScope)

    fun onQueryChanged(query: String) {
        searchQuery.value = query
    }
}

Why this works

  • debounce(300) → prevents API spam
  • flatMapLatest → cancels in-flight searches
  • cachedIn(viewModelScope) → survives rotation
  • No manual job handling needed

Repository: Cache-First, Network-Backed Paging

class SearchRepository(
    private val api: SearchApi,
    private val dao: SearchDao,
    private val db: AppDatabase
) {
    @OptIn(ExperimentalPagingApi::class)
    fun search(query: String): Flow<PagingData<SearchResult>> {
        return Pager(
            config = PagingConfig(
                pageSize = 20,
                prefetchDistance = 5,
                enablePlaceholders = false
            ),
            remoteMediator = SearchRemoteMediator(
                query = query,
                api = api,
                dao = dao,
                db = db
            ),
            pagingSourceFactory = {
                dao.searchPagingSource(query)
            }
        ).flow
            // Paging emits DB entities; map them to the UI model
            // (assumes a SearchEntity.toSearchResult() mapper)
            .map { pagingData -> pagingData.map { it.toSearchResult() } }
    }
}

Why Paging from DB (not network)?

  • Instant first paint (cached results)
  • Offline support
  • Consistent ordering
  • No duplicated pagination logic

Room: Full-Text Search (FTS) for Fast Local Search

@Entity(tableName = "search_results")
data class SearchEntity(
    @PrimaryKey val id: String,
    val title: String,
    val description: String,
    val query: String // the search query that produced this row (used by clearForQuery)
)

// Room ships FTS3/FTS4 support; FTS5 requires bundling a custom SQLite build
@Fts4(contentEntity = SearchEntity::class)
@Entity(tableName = "search_fts")
data class SearchFtsEntity(
    val title: String,
    val description: String
)

@Dao
interface SearchDao {

    @Query(
        """
        SELECT search_results.*
        FROM search_fts
        JOIN search_results ON search_fts.rowid = search_results.rowid
        WHERE search_fts MATCH :query
        """
    )
    fun searchPagingSource(query: String): PagingSource<Int, SearchEntity>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun insertAll(results: List<SearchEntity>)

    @Query("DELETE FROM search_results WHERE `query` = :query")
    suspend fun clearForQuery(query: String)
}

Data Point

  • FTS queries are typically ~10–50× faster than LIKE '%query%'
  • Works fully offline
  • Relevance ranking is built into FTS5 (rank); with FTS4 it can be derived via matchinfo()
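
One practical detail: raw user input can contain characters that are invalid MATCH syntax. A hedged sketch of a hypothetical toFtsQuery helper that turns free text into a prefix-match expression (exact escaping rules depend on the tokenizer in use):

// Illustrative: convert "red sho" into "red* sho*" for prefix matching,
// stripping punctuation that FTS MATCH would treat as operators.
fun toFtsQuery(raw: String): String =
    raw.split(Regex("\\s+"))
        .map { it.replace(Regex("[^\\p{L}\\p{N}]"), "") } // keep letters/digits only
        .filter { it.isNotBlank() }
        .joinToString(" ") { "$it*" }                     // prefix match per token

Usage: dao.searchPagingSource(toFtsQuery(userInput)).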

RemoteMediator: Cache Then Network

@OptIn(ExperimentalPagingApi::class)
class SearchRemoteMediator(
    private val query: String,
    private val api: SearchApi,
    private val dao: SearchDao,
    private val db: AppDatabase
) : RemoteMediator<Int, SearchEntity>() {

    override suspend fun load(
        loadType: LoadType,
        state: PagingState<Int, SearchEntity>
    ): MediatorResult {
        val page = when (loadType) {
            LoadType.REFRESH -> 1
            LoadType.APPEND -> state.pages.size + 1
            LoadType.PREPEND -> return MediatorResult.Success(endOfPaginationReached = true)
        }

        return try {
            val response = api.search(query, page, state.config.pageSize)
            db.withTransaction {
                if (loadType == LoadType.REFRESH) {
                    dao.clearForQuery(query)
                }
                // Assumes the API layer maps its DTOs to SearchEntity rows
                // tagged with the query that produced them
                dao.insertAll(response.items)
            }
            MediatorResult.Success(
                endOfPaginationReached = response.items.isEmpty()
            )
        } catch (e: Exception) {
            MediatorResult.Error(e)
        }
    }
}

Why RemoteMediator

  • Clean separation of concerns
  • Handles refresh vs append correctly
  • Guarantees DB consistency
  • Avoids “cache vs network” race conditions

Performance & Data Metrics (What Impresses Interviewers)

  • ~80% cache hit rate after first search
  • <50ms local FTS query latency
  • Zero network calls for repeated queries
  • Offline search fully functional
  • API calls reduced by 60–70% due to debounce + cache

Testing Strategy

  • ViewModel: test debounce + cancellation using runTest
  • Repository: fake API + in-memory Room
  • Paging: verify refresh + append behavior
  • FTS: verify ranking and tokenization

Architect-Level Summary (Say This Out Loud)

I debounce and cancel user input using Flow, page results using Paging 3, and treat Room FTS as the single source of truth. The network only refreshes the cache via a RemoteMediator, giving instant results, offline support, and a high cache-hit ratio with minimal API usage.

73. Check if Two Strings Are Anagrams (Efficiently)

How I’d Frame It in an Interview

An anagram check is fundamentally about character frequency equality.
 The optimal solution depends on the character set, not just string length.

I usually explain two approaches, then pick the optimal one based on constraints.

Approach 1: Frequency Counting (Optimal for Known Alphabets)

When to Use

  • Alphabet is bounded (ASCII, lowercase English letters)
  • Performance matters
  • Large input sizes

Time & Space Complexity

  • Time: O(n)
  • Space: O(1) (constant-sized array)

Implementation (Lowercase English Letters)

fun areAnagrams(s1: String, s2: String): Boolean {
    if (s1.length != s2.length) return false

    val counts = IntArray(26)
    for (i in s1.indices) {
        counts[s1[i] - 'a']++
        counts[s2[i] - 'a']--
    }
    return counts.all { it == 0 }
}

Why This Works

  • Single pass over both strings
  • Increment/decrement guarantees zero balance if characters match
  • Avoids sorting overhead and extra allocations

Data Point

For strings of length 1M:

  • Counting approach is ~3–5× faster than sorting
  • Zero object allocation → GC-friendly

Approach 2: Sorting (General but Slower)

When to Use

  • One-off checks
  • No strict performance constraints
  • Quick readability

fun areAnagramsBySorting(s1: String, s2: String): Boolean {
    if (s1.length != s2.length) return false
    return s1.toCharArray().sorted() == s2.toCharArray().sorted()
}

Complexity

  • Time: O(n log n)
  • Space: O(n)
  • Creates multiple intermediate objects

Unicode / Internationalization Case (Production-Ready)

Problem

  • Unicode characters are not bounded (emojis, CJK, accents)
  • Fixed-size array no longer works

Correct Approach: HashMap Frequency Count

fun areAnagramsUnicode(s1: String, s2: String): Boolean {
    if (s1.length != s2.length) return false

    val freq = HashMap<Char, Int>(s1.length)
    for (c in s1) {
        freq[c] = (freq[c] ?: 0) + 1
    }
    for (c in s2) {
        val count = freq[c] ?: return false
        if (count == 1) freq.remove(c) else freq[c] = count - 1
    }
    return freq.isEmpty()
}

Trade-offs

  • Time: O(n)
  • Space: O(k) where k = distinct characters
  • Slightly slower than array due to hashing, but correct for Unicode
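
Note that the HashMap version counts UTF-16 code units (Char). For strings containing characters outside the Basic Multilingual Plane (most emoji), counting code points is more precise — a minimal sketch:

// Counts Unicode code points instead of UTF-16 chars,
// so surrogate pairs (e.g., emoji) are treated as single characters.
fun areAnagramsByCodePoint(s1: String, s2: String): Boolean {
    if (s1.length != s2.length) return false

    val freq = HashMap<Int, Int>()
    var i = 0
    while (i < s1.length) {
        val cp = s1.codePointAt(i)
        freq.merge(cp, 1, Int::plus)
        i += Character.charCount(cp)
    }
    var j = 0
    while (j < s2.length) {
        val cp = s2.codePointAt(j)
        val remaining = (freq[cp] ?: return false) - 1
        if (remaining == 0) freq.remove(cp) else freq[cp] = remaining
        j += Character.charCount(cp)
    }
    return freq.isEmpty()
}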

Edge Cases I’d Call Out

  • Different lengths → early exit
  • Empty strings → valid anagrams
  • Case sensitivity (normalize if needed)
  • Whitespace / punctuation (strip if required by spec)

Example normalization:

val normalized = input.lowercase().filter { it.isLetterOrDigit() }

Tests

@Test
fun testAnagrams() {
assertTrue(areAnagrams("listen", "silent"))
assertFalse(areAnagrams("hello", "world"))
assertTrue(areAnagramsUnicode("猫犬", "犬猫"))
}

Interview-Ready One-Liner

If the character set is bounded, I use a constant-size frequency array for an O(n) solution. For Unicode or internationalized input, I switch to a HashMap-based frequency count. Sorting works but is suboptimal due to O(n log n) cost and extra allocations.

74. Meeting Rooms II — Minimum Number of Rooms

Problem Restatement

Given a list of meeting intervals [start, end), determine the minimum number of rooms required so that no meetings overlap.

This is fundamentally a peak concurrency problem.

Core Insight (What You Should Say First)

At any point in time, the number of rooms required equals the maximum number of overlapping meetings.
 So the problem reduces to tracking how many meetings are happening simultaneously.

Optimal Approach 1: Min-Heap of End Times (Most Common)

Idea

  1. Sort meetings by start time
  2. Use a min-heap to track the earliest ending meeting
  3. Reuse a room if the earliest end ≤ current start

Time & Space

  • Time: O(n log n)
  • Space: O(n)
  • Works comfortably up to ~10⁵ intervals

Correct, Production-Quality Implementation

⚠️ The candidate’s version is buggy:
 It sorts starts but then does a linear search to find the matching end, which breaks with duplicates and degrades to O(n²).
fun minMeetingRooms(intervals: Array<IntArray>): Int {
    if (intervals.isEmpty()) return 0

    // Sort by start time
    intervals.sortBy { it[0] }

    val minHeap = PriorityQueue<Int>() // stores end times
    minHeap.offer(intervals[0][1])

    for (i in 1 until intervals.size) {
        val start = intervals[i][0]
        val end = intervals[i][1]

        // Reuse a room if the earliest-ending meeting is already over
        if (minHeap.peek() <= start) {
            minHeap.poll()
        }
        minHeap.offer(end)
    }
    return minHeap.size
}

Why This Works

  • Heap size at any moment = rooms currently in use
  • Final heap size = max simultaneous meetings
  • Each meeting is pushed and popped once

Alternative Approach 2: Sweep Line (Even Cleaner)

Idea

  • Split all intervals into start and end events
  • Sort both lists
  • Walk them with two pointers

Complexity

  • Time: O(n log n)
  • Space: O(n)
  • No heap → fewer allocations → often faster in practice

fun minMeetingRoomsSweepLine(intervals: Array<IntArray>): Int {
    val starts = IntArray(intervals.size)
    val ends = IntArray(intervals.size)

    for (i in intervals.indices) {
        starts[i] = intervals[i][0]
        ends[i] = intervals[i][1]
    }
    starts.sort()
    ends.sort()

    var rooms = 0
    var endPtr = 0
    for (start in starts) {
        if (start < ends[endPtr]) {
            rooms++
        } else {
            endPtr++
        }
    }
    return rooms
}

Data Point

  • Sweep line is typically ~20–30% faster than heap-based solutions for large n
  • Excellent cache locality

Scaling Discussion (What to Say If Pushed)

Up to ~10⁵ intervals

  • Heap or sweep line is sufficient

Very large ranges / streaming input

  • Event counting
  • Segment tree / coordinate compression
  • Useful if:
  • Time range is bounded
  • You need dynamic updates (add/remove meetings)
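
A minimal sketch of the event-counting idea, using a TreeMap of time → delta and a running prefix sum (suitable when meetings arrive as a stream):

import java.util.TreeMap

// Each start contributes +1, each end contributes -1; the peak of the
// running sum over sorted times is the minimum number of rooms.
class MeetingCounter {
    private val deltas = TreeMap<Int, Int>()

    fun add(start: Int, end: Int) {
        deltas.merge(start, 1, Int::plus)
        deltas.merge(end, -1, Int::plus)
    }

    fun minRooms(): Int {
        var running = 0
        var peak = 0
        for (delta in deltas.values) {
            running += delta
            peak = maxOf(peak, running)
        }
        return peak
    }
}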

Edge Cases to Call Out

  • Empty input → 0 rooms
  • Meetings ending exactly when another starts → reuse room
  • Identical start times → separate rooms

Interview-Ready Summary

I sort meetings by start time and use a min-heap to track the earliest ending meeting, reusing rooms whenever possible. The heap size gives the maximum overlap, which is the minimum number of rooms required. Time complexity is O(n log n), and a sweep-line variant can reduce overhead further.

75. Non-Functional Requirement: Reduce Battery Consumption (Android)

How I Frame This as an Architect

Battery is not a single optimization — it’s the sum of wakeups, radios, CPU time, and bad scheduling.
 The goal is to do less work, less often, and at the right time, while respecting system policies like Doze and App Standby.

I usually group strategies into four buckets.

1️⃣ Defer & Batch Work (Biggest Win)

Principle

  • Wakeups are expensive
  • Doing 10 small tasks costs more battery than doing 1 batched task

Correct Tool: WorkManager

val constraints = Constraints.Builder()
    .setRequiresCharging(true)
    .setRequiresDeviceIdle(true)
    .setRequiredNetworkType(NetworkType.UNMETERED)
    .build()

val workRequest =
    PeriodicWorkRequestBuilder<SyncWorker>(15, TimeUnit.MINUTES) // 15 min is the minimum period
        .setConstraints(constraints)
        .build()

// Unique work prevents duplicate periodic jobs from piling up on every app start
WorkManager.getInstance(context)
    .enqueueUniquePeriodicWork("periodic_sync", ExistingPeriodicWorkPolicy.KEEP, workRequest)

Why This Matters

  • Lets the OS batch work across apps
  • Automatically respects Doze & App Standby
  • Avoids custom scheduling bugs

📉 Typical gain: 20–40% reduction in background drain

2️⃣ Location: Batch, Don’t Poll

Bad

  • High-frequency GPS polling
  • Manual timers

Correct: Fused Location Provider (Batching)

val request = LocationRequest.Builder(
    Priority.PRIORITY_BALANCED_POWER_ACCURACY,
    10_000L
)
    .setMinUpdateIntervalMillis(60_000L)
    .setMaxUpdateDelayMillis(5 * 60_000L) // batch updates
    .build()

Why

  • GPS is one of the top battery killers
  • Batching lets the chipset sleep longer
  • Balanced accuracy is often 5–10× cheaper than high accuracy

3️⃣ Network: Reduce Radio Wakeups

Principles

  • Radios are expensive to wake
  • One large request is cheaper than many small ones

Strategies

  • Batch API calls
  • Compress payloads
  • Cache aggressively

// Single batched endpoint instead of multiple calls
POST /sync
{
  "events": [...],
  "metrics": [...],
  "logs": [...]
}

With OkHttp:

  • HTTP/2 is negotiated automatically when the server supports it
  • Connection pooling is on by default
  • Transparent gzip for response bodies (see the client sketch below)

📉 Observed gain: ~15–25% battery savings in data-heavy apps
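
A minimal OkHttp setup sketch illustrating these points; the cache directory name and 10 MB size are arbitrary choices for the example:

import okhttp3.Cache
import okhttp3.OkHttpClient
import java.io.File

// HTTP/2, connection pooling, and transparent gzip are all OkHttp defaults;
// the main explicit win here is an on-disk response cache.
fun buildHttpClient(cacheDir: File): OkHttpClient =
    OkHttpClient.Builder()
        .cache(Cache(File(cacheDir, "http_cache"), 10L * 1024 * 1024)) // 10 MB
        .build()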

4️⃣ Wake Locks: Avoid Direct Usage

Rule

If you think you need a WakeLock, you probably don’t.

Preferred

  • WorkManager
  • AlarmManager (exact alarms only if justified)
  • Foreground service only for user-visible work

// Alarm triggers → WorkManager handles execution safely
alarmManager.setExactAndAllowWhileIdle(...)

Why:

  • Prevents runaway wakelocks
  • OS manages lifetime and batching
  • Safer across OEMs

Measurement & Validation (This Is What Impresses)

  • Battery Historian → verify wakeups, radio usage
  • adb bugreport
  • Android Studio Energy Profiler

Example NFR Target

<1% battery drain per hour during background operation

Validated using:

  • 24-hour idle test
  • Doze + App Standby enabled

Architect Summary (Say This)

Battery optimization is about deferring and batching work, minimizing radio and GPS usage, and letting the OS schedule execution. I rely on WorkManager with constraints, batch network and location updates, avoid direct wake locks, and validate results with Battery Historian against defined drain targets.

76. Design a URL Shortener (TinyURL-Scale System)

Clarify Requirements First (Architect Move)

Functional

  • Shorten URL
  • Redirect fast

Non-Functional

  • Massive scale (billions)
  • Low latency
  • High availability
  • No collisions
  • Cache-friendly

Core Design Insight

This is a read-heavy system:

  • Writes: once per URL
  • Reads: potentially millions per URL

So:

  • Cache first
  • Shard by key
  • CDN at the edge

Short Code Strategy

Base62 Encoding

Character set:

[a–z][A–Z][0–9] → 62 characters

Keyspace:

62⁶ ≈ 56.8 billion URLs with 6-character codes

Enough for years at internet scale.

Data Flow

Client
→ API Gateway
→ URL Service
→ Redis (hot cache, ~90%)
→ Distributed DB (Cassandra / DynamoDB)
→ Redirect

Code (Cleaned & Correct)

Base62 Encoder

object Base62 {
    private const val CHARS =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

    fun encode(num: Long): String {
        if (num == 0L) return CHARS[0].toString()
        var n = num
        val sb = StringBuilder()
        while (n > 0) {
            sb.append(CHARS[(n % 62).toInt()])
            n /= 62
        }
        return sb.reverse().toString()
    }

    fun decode(str: String): Long {
        var result = 0L
        for (c in str) {
            result = result * 62 + CHARS.indexOf(c)
        }
        return result
    }
}

URL Shortening Logic

class UrlShortenerService(
    private val repo: UrlRepository,
    private val cache: RedisClient
) {
    suspend fun shorten(longUrl: String): String {
        val id = repo.nextId() // Snowflake / DB sequence
        val code = Base62.encode(id)
        repo.store(id, longUrl)
        cache.set(code, longUrl)
        return "https://tiny/$code"
    }

    suspend fun expand(code: String): String? {
        return cache.get(code)
            ?: repo.get(Base62.decode(code))?.also {
                cache.set(code, it)
            }
    }
}

Storage Strategy

Database

  • Cassandra / DynamoDB
  • Partition key = ID or hash prefix
  • Write once, read many

Cache

  • Redis
  • 90%+ hit rate
  • TTL optional (URLs rarely change)

Scaling to 1B Requests / Day

  • CDN edge caching for redirects
  • Consistent hashing for Redis shards
  • Stateless API servers
  • Horizontal scaling

📊 Example numbers:

  • Redis hit: ~1 ms
  • DB hit: ~10–20 ms
  • CDN hit: <1 ms

Collision Handling

Avoid hashing collisions entirely by:

  • Using monotonic IDs (Snowflake / sequence)
  • Base62 is encoding, not hashing

If hashing is required:

  • Append a counter or salt
  • Retry on collision (rare) — a sketch follows below
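
A hedged sketch of the hash-based alternative with collision retry. UrlRepository.existsCode and storeCode are hypothetical helpers, not part of the service above:

import java.security.MessageDigest

// Illustrative only: derive a 7-char code from a SHA-256 hash of the URL,
// retrying with a salt suffix if the code is already taken.
suspend fun shortenByHash(longUrl: String, repo: UrlRepository): String {
    var salt = 0
    while (true) {
        val input = longUrl + if (salt == 0) "" else "#$salt"
        val digest = MessageDigest.getInstance("SHA-256").digest(input.toByteArray())

        // Fold the first 8 bytes into a non-negative Long, then Base62-encode
        val num = digest.take(8).fold(0L) { acc, b -> (acc shl 8) or (b.toLong() and 0xFFL) }
        val code = Base62.encode(num and Long.MAX_VALUE).take(7)

        if (!repo.existsCode(code)) {
            repo.storeCode(code, longUrl)
            return code
        }
        salt++ // collision (rare): retry with a different salt
    }
}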

Architect One-Liner

I generate short URLs using Base62-encoded monotonic IDs, store mappings in a sharded, write-optimized datastore, fronted by Redis and CDN for ultra-low-latency redirects. The system is read-heavy, horizontally scalable, and collision-free by design.

77. Implement an LRU Cache with O(1) Operations

What the Interviewer Is Testing

They want to see if you can:

  • Maintain recency ordering
  • Achieve O(1) get and put
  • Manage pointer invariants safely
  • Avoid hidden O(n) work

The canonical solution is:

HashMap + Doubly Linked List

Core Design

  • HashMap<K, Node> → O(1) lookup
  • Doubly Linked List → O(1) remove + move-to-front
  • Sentinel head/tail nodes → simplify edge cases

Invariant

head <-> ... <-> most recent <-> tail
  • Head.next = Least Recently Used
  • Tail.prev = Most Recently Used

Correct, Production-Quality Implementation (Kotlin)

class LruCache(
    private val capacity: Int
) {
    private val map = HashMap<Int, Node>(capacity)
    private val head = Node(0, 0) // LRU sentinel
    private val tail = Node(0, 0) // MRU sentinel

    init {
        head.next = tail
        tail.prev = head
    }

    fun get(key: Int): Int? {
        val node = map[key] ?: return null
        moveToTail(node)
        return node.value
    }

    fun put(key: Int, value: Int) {
        map[key]?.let { existing ->
            existing.value = value
            moveToTail(existing)
            return
        }

        val node = Node(key, value)
        map[key] = node
        addToTail(node)

        if (map.size > capacity) {
            val lru = removeHead()
            map.remove(lru.key)
        }
    }

    // ---------- Linked List Operations ----------

    private fun moveToTail(node: Node) {
        remove(node)
        addToTail(node)
    }

    private fun addToTail(node: Node) {
        node.prev = tail.prev
        node.next = tail
        tail.prev!!.next = node
        tail.prev = node
    }

    private fun removeHead(): Node {
        val lru = head.next!!
        remove(lru)
        return lru
    }

    private fun remove(node: Node) {
        node.prev!!.next = node.next
        node.next!!.prev = node.prev
    }

    // Plain class, not a data class: generated equals/hashCode/toString
    // would recurse through the circular prev/next links
    private class Node(
        val key: Int,
        var value: Int,
        var prev: Node? = null,
        var next: Node? = null
    )
}

Bugs in the Candidate Version (Good to Call Out)

  1. Capacity check off-by-one

if (size++ > capacity)

This allows capacity + 1 elements briefly.

  2. Unnecessary remove + reinsert on update

  • Updating a value should not create a new node
  • Just move the existing node to the MRU position

  3. Manual size tracking

  • Redundant when map.size already exists
  • A source of subtle bugs

Time & Space Complexity

  • get: O(1)
  • put: O(1)
  • Space: O(capacity)

This holds even under heavy load (10⁶+ ops/sec on the JVM).

Thread Safety (If Asked)

This implementation is not thread-safe.

Options:

  • Synchronize public methods (see the sketch below)
  • Use ReentrantLock
  • Partition cache into segments
  • Or rely on higher-level synchronization
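
A minimal sketch of the simplest option — wrapping the cache with a single lock (fine for moderate contention; segmented locking scales better):

import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Coarse-grained locking: correctness first, throughput second.
class SynchronizedLruCache(capacity: Int) {
    private val lock = ReentrantLock()
    private val delegate = LruCache(capacity)

    fun get(key: Int): Int? = lock.withLock { delegate.get(key) }

    fun put(key: Int, value: Int) = lock.withLock { delegate.put(key, value) }
}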

Production Alternative: LinkedHashMap

class LruCache<K, V>(
    private val capacity: Int
) : LinkedHashMap<K, V>(capacity, 0.75f, true) {
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>): Boolean {
        return size > capacity
    }
}

Trade-offs

✅ Simple
 ❌ Less control
 ❌ Harder to extend (TTL, metrics, eviction hooks)

When This Comes Up in Android

  • In-memory image caches
  • API response caches
  • ViewModel-level memoization
  • Paging placeholders
  • Disk cache index

Interview-Ready Summary

I use a HashMap for constant-time lookup and a doubly linked list to track access order. Every access moves the node to the MRU position, and eviction removes the LRU node from the head. Sentinel nodes simplify edge cases, and all operations remain O(1).

78. Design Twitter Timeline (Home Feed) — Scale to 300M Users

Step 1: Clarify Requirements (Architect Move)

Functional

  • Post tweets
  • Read home timeline (latest tweets from people I follow)

Non-Functional

  • Read heavy (100× reads vs writes)
  • Low latency (<1s P99)
  • Highly available
  • Eventually consistent is acceptable
  • Must support celebrity accounts (millions of followers)

Step 2: Core Insight

There is no single strategy that works at Twitter scale.

We use a hybrid model:

  • Fan-out on write for normal users
  • Fan-out on read for hot users (celebrities)

This avoids:

  • Write explosion for hot users
  • Read-time slowness for normal users

Step 3: High-Level Architecture

Client
→ API Gateway
→ Timeline Service
→ Redis (home timelines)
→ Tweet Store (Cassandra)
→ Follow Graph (MySQL / Graph store)

Supporting systems:

  • Kafka (async fan-out)
  • CDN (media)
  • Cache warming jobs

Step 4: Data Model

Tweets (Write-Once, Read-Many)

Tweet {
  tweetId (Snowflake)
  userId
  content
  timestamp
}

Stored in Cassandra:

  • Partition key: userId
  • Clustering key: timestamp DESC

Home Timeline (Materialized View)

Redis Key: timeline:{userId}
Value: [tweetId1, tweetId2, ...]
TTL: few hours

Step 5: Write Path (Posting a Tweet)

Normal Users (Fan-out on Write)

suspend fun postTweet(userId: Long, tweet: Tweet) {
    tweetStore.insert(tweet)

    // Hot users are not fanned out; their tweets are pulled at read time
    if (!isHotUser(userId)) {
        kafka.publish("fanout", tweet)
    }
}

Async Fan-out Worker

suspend fun fanOut(tweet: Tweet) {
    followers(tweet.userId)
        .chunked(1_000)
        .forEach { batch ->
            redis.pipeline {
                batch.forEach { followerId ->
                    lpush("timeline:$followerId", tweet.tweetId)
                }
            }
        }
}

Why Async?

  • Protects write latency
  • Smooths traffic spikes
  • Retryable & observable

Hot Users (Fan-out on Read)

For users with >10k–100k followers:

  • Do not push to follower timelines
  • Store only in tweet store

This avoids millions of Redis writes per tweet.

Step 6: Read Path (Fetching Timeline)

suspend fun getTimeline(userId: Long): List<Tweet> {
    val cachedIds =
        redis.lrange("timeline:$userId", 0, 50)

    val tweets =
        tweetStore.getTweets(cachedIds)

    val missing = 50 - tweets.size
    if (missing > 0) {
        val hotTweets =
            pullFromHotUsers(userId, missing)
        return mergeAndSort(tweets, hotTweets)
    }
    return tweets
}

What Happens Here

  • Fast path: Redis hit (sub-10 ms)
  • Slow path: Merge in tweets from hot users (see the mergeAndSort sketch below)
  • Sorted by timestamp before returning
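
The pullFromHotUsers and mergeAndSort helpers above are placeholders; a minimal sketch of the merge step, assuming Tweet is a Kotlin data class with the tweetId and timestamp fields from the model above:

// Merge cached and hot-user tweets, newest first, de-duplicated by id.
fun mergeAndSort(cached: List<Tweet>, hot: List<Tweet>): List<Tweet> =
    (cached + hot)
        .distinctBy { it.tweetId }
        .sortedByDescending { it.timestamp }
        .take(50)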

Step 7: Follow Graph

  • Stored separately (MySQL / graph DB)
  • Cached aggressively
  • Read-heavy
followers(userId)
following(userId)

Step 8: Scaling to 300M Users

Key Techniques

  • Hybrid fan-out: push for regular users, pull for celebrities
  • Timelines materialized in Redis with short TTLs
  • Kafka for asynchronous, retryable fan-out
  • Stateless timeline services behind the API gateway
  • CDN for media, plus cache-warming jobs for active users

Step 9: Performance Numbers (Interview Gold)

  • Redis hit: ~5–10 ms
  • Cassandra read: ~15–30 ms
  • Kafka fan-out: async, near-real-time
  • P99 home feed: <1 second
  • Celebrity tweet fan-out avoided: millions of writes saved

Step 10: Consistency Model

  • Eventually consistent
  • Missing tweets acceptable briefly
  • Ordering is “mostly correct” (timestamp-based)

This is explicitly acceptable for social feeds.

Step 11: Failure Handling

  • Redis miss → regenerate from tweet store
  • Kafka retry → at-least-once fan-out
  • Cache rebuild jobs

Interview One-Liner Summary

At Twitter scale, the home timeline uses a hybrid model: fan-out on write for normal users and fan-out on read for celebrities. Tweets are stored once, timelines are materialized in Redis, and async pipelines smooth spikes. This keeps write costs bounded while delivering sub-second read latency to hundreds of millions of users.

79. Rate Limiter — Token Bucket & Sliding Window

Step 1: Requirements Clarification

  • Limit requests per user/API key/IP
  • High concurrency (~1000 req/s per key)
  • Global consistency (multi-node)
  • Optionally bursty traffic
  • Low latency (<1 ms check)

Common algorithms:

  1. Token Bucket — allows bursts
  2. Leaky Bucket — smooths traffic
  3. Fixed Window / Sliding Window Log / Sliding Window Counter — precise per-window

Step 2: Token Bucket (Production-Grade)

Principle

  • Each key has:
      • tokens (available requests)
      • lastRefill timestamp
  • Refill tokens at a fixed rate
  • Allow the request if tokens >= 1

Lua Script for Atomicity (Redis)

-- KEYS[1] = key prefix
-- ARGV[1] = capacity
-- ARGV[2] = refill_rate
-- ARGV[3] = now (ms)

local tokens_key = KEYS[1] .. ":tokens"
local ts_key = KEYS[1] .. ":last"

local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

local last = tonumber(redis.call("GET", ts_key) or now)
local tokens = tonumber(redis.call("GET", tokens_key) or capacity)

local delta = (now - last) / 1000.0 * rate
tokens = math.min(capacity, tokens + delta)

local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end

redis.call("SET", tokens_key, tokens)
redis.call("SET", ts_key, now)

return allowed

✅ Guarantees atomic updates in Redis under high concurrency.

Kotlin Wrapper

class RateLimiter(
    private val capacity: Int,
    private val refillPerSecond: Int,
    private val redis: RedisClient
) {
    suspend fun isAllowed(key: String): Boolean {
        val now = System.currentTimeMillis()
        val result = redis.eval<Long>(
            tokenBucketLua,
            keys = listOf("rate:$key"),
            args = listOf(capacity.toString(), refillPerSecond.toString(), now.toString())
        )
        return result == 1L
    }
}

Performance

  • ~100k checks/sec per Redis node
  • Handles bursts automatically up to capacity

Step 3: Sliding Window Counter (Precise per interval)

Principle

  • Avoid “spiky bursts at window boundaries” of fixed window
  • Track multiple sub-windows in Redis
  • Count requests per sub-window, sum last N windows

Implementation Sketch

suspend fun isAllowedSliding(key: String, limit: Int, windowMs: Long): Boolean {
    val now = System.currentTimeMillis()
    val bucket = now / 1000 // 1-second sub-buckets
    val redisKey = "sw:$key:$bucket"

    redis.incr(redisKey)
    redis.expire(redisKey, (windowMs / 1000).toInt())

    // Sum all live sub-buckets for this key (expired buckets drop out automatically)
    val total = redis.keys("sw:$key:*")
        .sumOf { redis.get(it).toInt() }

    return total <= limit
}

Notes

  • More precise than fixed window
  • Slightly more Redis overhead (keys per sub-window)
  • Can also use Lua + a sorted set to avoid scanning keys (see the sketch below)
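
A hedged sketch of the sorted-set variant (a sliding window log), reusing the same hypothetical RedisClient; zadd, zremrangebyscore, zcard, and expire are assumed to map to the standard Redis commands of the same name:

import java.util.UUID

// Sliding window log: one sorted-set member per request, scored by timestamp.
// Trimming old entries before counting avoids scanning keys entirely.
suspend fun isAllowedWindowLog(
    redis: RedisClient,
    key: String,
    limit: Int,
    windowMs: Long
): Boolean {
    val now = System.currentTimeMillis()
    val zkey = "swl:$key"

    // Drop entries older than the window
    redis.zremrangebyscore(zkey, 0.0, (now - windowMs).toDouble())

    if (redis.zcard(zkey) >= limit) return false

    // Unique member so concurrent requests in the same millisecond don't collide
    redis.zadd(zkey, now.toDouble(), "$now:${UUID.randomUUID()}")
    redis.expire(zkey, (windowMs / 1000).toInt() + 1)
    return true
}

As with the token bucket, these steps would be wrapped in a single Lua script in production to stay atomic under concurrency.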

Step 4: Comparison — Token Bucket vs Sliding Window

  • Token Bucket: allows bursts up to capacity, constant Redis work per check, smooths the average rate
  • Sliding Window: strict per-interval limits with no boundary spikes, at the cost of extra Redis overhead per check

Step 5: Production Best Practices

  1. Atomic Redis update — Lua is mandatory under concurrency.
  2. Sharding by key — for millions of users.
  3. TTL cleanup — prevent memory leak from keys.
  4. Client-side fallback — optional local token bucket to reduce Redis load (sketch below).
  5. Monitoring — track rejected requests, burst patterns.
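
A minimal in-process token bucket that could serve as the client-side fallback (per-key instances would be held in a map; this is illustrative, not tied to any library):

// Single-key, in-memory token bucket; synchronized for simple thread safety.
class LocalTokenBucket(
    private val capacity: Double,
    private val refillPerSecond: Double
) {
    private var tokens = capacity
    private var lastRefillMs = System.currentTimeMillis()

    @Synchronized
    fun tryAcquire(): Boolean {
        val now = System.currentTimeMillis()
        tokens = minOf(capacity, tokens + (now - lastRefillMs) / 1000.0 * refillPerSecond)
        lastRefillMs = now

        if (tokens < 1.0) return false
        tokens -= 1.0
        return true
    }
}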

Step 6: Staff Architect Talking Points

For per-user API limits, I prefer a token bucket with atomic Redis/Lua scripts for high concurrency. For stricter per-second windows, I use sliding window counters with sub-buckets. The choice depends on whether bursts are acceptable. At production scale, sharding and TTL cleanup are essential, and Redis atomicity ensures correctness under race conditions.

80. Consistent Hashing for Sharding

Step 1: Problem

  • You have N servers / shards
  • You want to map keys → servers deterministically
  • Minimize key remapping when servers are added/removed
  • Support hot partitions, e.g., in Redis or cache clusters

Naive approach:

server = hash(key) % N

❌ Problem: full remap on N change

Step 2: Core Idea — Consistent Hash Ring

  1. Map servers to a ring (0 → 2³² − 1)
  2. Map keys to the ring
  3. Pick the next clockwise server
  4. Add/remove server → only keys between old/new neighbors are remapped

Virtual Nodes (Replicas)

  • Avoid uneven load (hot spots)
  • Each physical server → M virtual nodes on the ring
  • Hash each virtual node separately

Step 3: Kotlin Implementation

class ConsistentHashRing(
    private val replicas: Int = 100
) {
    private val ring = sortedMapOf<Int, String>() // hash -> server

    fun addServer(server: String) {
        repeat(replicas) { i ->
            val hash = hash("$server#$i")
            ring[hash] = server
        }
    }

    fun removeServer(server: String) {
        repeat(replicas) { i ->
            val hash = hash("$server#$i")
            ring.remove(hash)
        }
    }

    fun getServer(key: String): String {
        if (ring.isEmpty()) throw IllegalStateException("No servers in ring")

        val hash = hash(key)
        val tailMap = ring.tailMap(hash)
        return if (tailMap.isNotEmpty()) {
            tailMap.values.first()
        } else {
            ring.values.first() // wrap around
        }
    }

    private fun hash(str: String): Int =
        str.hashCode() and 0x7fffffff // positive int; swap in MurmurHash/CRC32 for better spread
}
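
A quick usage example of the ring above (server names are placeholders):

fun main() {
    val ring = ConsistentHashRing(replicas = 100)
    ring.addServer("cache-1")
    ring.addServer("cache-2")
    ring.addServer("cache-3")

    println(ring.getServer("user:42"))   // deterministic mapping to one server

    // Adding a node only remaps keys that now fall on its virtual nodes (~1/N of keys)
    ring.addServer("cache-4")
    println(ring.getServer("user:42"))   // usually unchanged
}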

Step 4: Why Virtual Nodes Matter

  • Without replicas: one server can get all keys near its hash
  • With 100–200 replicas per server: key distribution is uniform
  • When adding a server:
      • Only keys in ranges assigned to new virtual nodes move
      • Typically ~1/N of keys remap

Step 5: Scaling Notes

  • Hash function: use MurmurHash / CRC32 for uniform distribution
  • Ring can be implemented as:
      • TreeMap / SortedMap → O(log N) lookup
      • Binary search over a sorted array → faster for very large clusters
  • Redis Cluster, Cassandra, and Akamai edge caching use this pattern

Step 6: Handling Hot Partitions

  • Assign more virtual nodes to small servers → weighted distribution
  • Rebalance rarely, use asynchronous migration
  • Optional: consistent hashing + jump consistent hash for ultra-low remap

Step 7: Summary / Interview Talking Points

Consistent hashing allows adding/removing nodes with minimal key remapping. Virtual nodes smooth out hot spots. Production clusters like Redis or Cassandra use this to ensure that ~1/N keys are remapped when a new node joins, avoiding full re-sharding.
