Mid questions
Every rated question at this level, grouped by topic and tagged with the company it came up at.
CI/CD 9 questions
- Automated GitOps Promotion
Scenario
You are setting up a GitOps pipeline. repo-a contains the application source code, and repo-b contains the Kubernetes configuration (values.yaml).
When code is pushed to repo-a, the pipeline must:
  - Build a Docker image tagged with the short Git SHA.
  - Automatically update repo-b to use this new image tag.
Task
Edit the existing workflow file at /home/interview/repo-a/.github/workflows/promote.yml to complete the pipeline.
  - Calculate SHA: Finish the step to extract the first 7 characters of $GITHUB_SHA into an environment variable named SHORT_SHA.
  - Build: Build a Docker image named app tagged with ${{ env.SHORT_SHA }}.
  - Checkout Infrastructure: Configure the checkout step to clone interview/repo-b into a directory named infra.
  - Promote: Inside the infra directory:
    - Update values.yaml so that the tag key reflects the new SHORT_SHA.
    - Commit the change with the message: Update tag to <SHORT_SHA>.
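The Calculate SHA step comes down to a substring operation. A minimal sketch, assuming a POSIX shell in the run step; the helper name short_sha and the local fallback values are illustrative and not part of the starter workflow:

```shell
#!/bin/sh
# short_sha is a hypothetical helper name: print the first 7 characters of a commit SHA.
short_sha() {
  printf '%s\n' "$1" | cut -c1-7
}

# On a GitHub Actions runner, GITHUB_SHA and GITHUB_ENV are set for you.
# The fallbacks are placeholders so the sketch also runs outside a workflow.
GITHUB_SHA="${GITHUB_SHA:-0123456789abcdef0123456789abcdef01234567}"
GITHUB_ENV="${GITHUB_ENV:-/dev/null}"

# Lines appended to the file named by GITHUB_ENV become env.* values in later steps.
echo "SHORT_SHA=$(short_sha "$GITHUB_SHA")" >> "$GITHUB_ENV"
```

Later steps could then refer to the value as ${{ env.SHORT_SHA }}, for example when tagging the image or rewriting the tag key in values.yaml.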
- Automated Rollback on Deployment Failure with Values File Restoration
Scenario
A repository at /home/interview/deploy-repo contains a values.yaml file tracking the current Docker image tag and a deploy.sh script that simulates production deployment. When deployments fail, the team must manually revert values.yaml to the previous working tag. An automated rollback mechanism is needed to detect failures and restore the last known-good tag with a commit by CI Bot. A starter workflow file has been created at .github/workflows/deploy.yml with the basic structure.
Task
Navigate to /home/interview/deploy-repo and complete the GitHub Actions workflow at .github/workflows/deploy.yml that triggers on push to the main branch with two jobs: a deploy job that runs ./deploy.sh, and a rollback job that depends on the deploy job, runs only when deployment fails, restores values.yaml to its state from the previous commit, and commits the change with author "CI Bot" [email protected] and the message: chore: automatic rollback due to deployment failure. Test with: ./github-ci push
- Docker Image Tagging with Commit SHA
Scenario
A repository at /home/interview/repo contains a Dockerfile but no automated build process. Developers manually build Docker images without consistent tagging, making it difficult to track which image corresponds to which code version. A starter workflow file has been created at .github/workflows/build.yml with the basic structure.
Task
Navigate to /home/interview/repo and complete a GitHub Actions workflow at .github/workflows/build.yml that automatically builds a Docker image named app and tags it with the short commit SHA on every push to the main branch. The pipeline should ensure each commit produces a uniquely identifiable image. The CI pipeline can be executed and tested with the ./github-ci push command.
- GitHub Actions Matrix Build Strategy
Scenario
A repository at /home/interview/repo contains a Node.js application with a test suite at ./run-tests.sh. The application needs to be tested across multiple Node.js versions (18, 20, and 22) to ensure compatibility, but currently has no automated testing configured. A starter workflow file has been created at .github/workflows/pr-tests.yml with the basic structure. The repository has the upload-artifact action available locally at .github/actions/upload-artifact for artifact management.
Task
Navigate to /home/interview/repo and complete the GitHub Actions workflow at .github/workflows/pr-tests.yml that runs the test suite across Node.js versions 18, 20, and 22 using a matrix strategy. Each matrix job should create a node-version.txt artifact containing its Node.js version number. The workflow should execute on pull requests. The workflow can be tested with ./github-ci pull_request and artifacts will be available in /tmp/github-artifacts/1/ for verification.
Important: Use container images (node:${{ matrix.node-version }}-slim) instead of setup-node actions to configure Node.js versions.
- Job Dependency Enforcement
Scenario
A repository at /home/interview/repo contains a workflow file .github/workflows/pipeline.yml. The workflow currently defines three jobs: lint, test, and build. These jobs are configured to run in parallel without any dependencies.
Task
Update the pipeline to enforce a strict execution sequence. Configure the test job to depend on the successful completion of lint. Configure the build job to depend on the successful completion of test. The final execution order must be Lint → Test → Build. The pipeline execution can be verified with ./github-ci push.
- Multi-Job Workflow with Artifact Handoff
Scenario
A repository at /home/interview/repo contains a Node.js application with a test suite at ./run-tests.sh. The development team needs a two-stage CI/CD pipeline where the first job runs tests and generates test results, and a second job downloads those results to create a summary report. The repository has upload-artifact and download-artifact actions available locally at .github/actions/ for artifact management. A starter workflow file has been created at .github/workflows/artifact-handoff.yml with the basic structure.
Task
Navigate to /home/interview/repo and complete the GitHub Actions workflow at .github/workflows/artifact-handoff.yml that implements a two-job pipeline triggered on pull_request events using container images (node:20-slim):
  - Job A (test-job): Run the test suite and upload the test-results.txt file as an artifact named "test-results" using ./.github/actions/upload-artifact
  - Job B (report-job): Download the test-results artifact using ./.github/actions/download-artifact, verify it exists, count passed tests with grep -c "PASS:", save the count to summary.txt in format "Total Passed Tests: X", and upload as "test-summary" artifact. Job B must depend on Job A using the "needs" keyword.
The workflow can be tested with: ./github-ci pull_request
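The counting step in Job B can be sketched as a small shell helper. This is only an illustration under assumptions: the function name write_summary is hypothetical, and the file paths would depend on where the download action places the artifact.

```shell
#!/bin/sh
# write_summary is a hypothetical helper: count PASS lines and write the summary file.
# $1 = downloaded results file, $2 = summary file to create
write_summary() {
  results_file="$1"
  summary_file="$2"
  # Verify the downloaded artifact exists before counting.
  if [ ! -f "$results_file" ]; then
    echo "missing results file: $results_file" >&2
    return 1
  fi
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  passed=$(grep -c "PASS:" "$results_file" || true)
  printf 'Total Passed Tests: %s\n' "$passed" > "$summary_file"
}
```

A run step might call it as write_summary test-results.txt summary.txt before the upload step.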
- Path-Based Workflow Execution
Scenario
A repository at /home/interview/repo contains a multi-component application with infrastructure code in the /infra directory and documentation in the /docs directory. Currently, the repository has no automated workflow configured with path-based filtering to run workflows only when specific files change. A starter workflow file has been created at .github/workflows/infra-check.yml with the basic structure. The repository has the upload-artifact action available locally at .github/actions/upload-artifact for artifact management.
Task
Navigate to /home/interview/repo and complete the GitHub Actions workflow at .github/workflows/infra-check.yml that runs only when files under the /infra directory change. The workflow should trigger on push events, use container image (node:20-slim), run a validation script at ./validate-infra.sh, and create an artifact named "validation-report" containing validation-result.txt using ./.github/actions/upload-artifact. The workflow can be tested with ./github-ci push and artifacts will be available in /tmp/github-artifacts/1/ for verification.
- PR Test Gate
Scenario
A repository at /home/interview/repo has a test suite located at ./run-tests.sh but no automated testing on pull requests. Developers must manually run tests before merging, leading to inconsistent test coverage and occasional bugs in the main branch. A starter workflow file has been created at .github/workflows/pr-tests.yml with the basic structure. The repository already has the upload-artifact action available locally at .github/actions/upload-artifact for artifact management.
Task
Navigate to /home/interview/repo and complete the GitHub Actions workflow at .github/workflows/pr-tests.yml that triggers on pull request events, runs the test suite ./run-tests.sh, and uploads the test results from test-results.txt as an artifact named test-results using ./.github/actions/upload-artifact. The artifact upload should run even if tests fail using if: always(). Test with: ./github-ci pull_request and artifacts will be available in /tmp/github-artifacts/1/ for verification.
- Reusable Workflow with Input Parameters
Scenario
A repository at /home/interview/repo contains multiple applications that share common build and test steps. The development team wants to eliminate code duplication across workflows by creating a centralized reusable workflow that can be called from different workflows with different parameters.
Task
Create two GitHub Actions workflows: a reusable workflow at .github/workflows/shared-build.yml that accepts an input parameter app-name, runs a build script ./build.sh with the app name, and uploads a build artifact named "build-{app-name}"; and a caller workflow at .github/workflows/deploy.yml that triggers on push events, calls the reusable workflow with app-name: "frontend", and uses container image (node:20-slim). You can test with: ./github-ci push
Cloud 1 question
- Design Egress Only VPC with NAT
Scenario
We need to prepare infrastructure for ECS tasks and EC2 instances. It has to span at least two Availability Zones. These workloads require outbound internet access to download updates and call external APIs. However, inbound access is not allowed. Additionally, the application should send data to S3 in a cost-effective way, so we have to deploy the necessary infrastructure for that traffic too.
Note: You are required to design the VPC networking architecture only. Creation of ECS clusters, services, or EC2 instances is not part of this task.
Task
Design and implement a VPC network architecture using the CIDR block 10.0.0.0/16 that meets the following requirements:
  - Network Isolation: Workloads must reside in subnets across two Availability Zones with no public IP addresses and must not be directly reachable from the internet.
  - Egress Control: Workloads must be able to initiate outbound connections to the public internet over HTTP and HTTPS. The internet must not be able to initiate connections to them. Note: You should create and configure a new non-default security group.
  - Egress Restrictions: Outbound traffic from workloads should be limited to only the required protocols and destinations.
  - Cost Awareness: The architecture should account for cost-efficient routing when accessing AWS-managed services, minimizing unnecessary NAT Gateway usage.
Note: You can use either the AWS Management Console or AWS CLI to complete this task.
Containers 5 questions
- Graceful Shutdown with SIGTERM Handling
Scenario:
You have a container image myapp:grace built from /home/interview/Dockerfile that runs an application requiring cleanup on shutdown. When you run docker stop, the container is killed after the 10-second timeout instead of shutting down gracefully because the application doesn't properly handle the SIGTERM signal.
Task:
Fix the application script and Dockerfile to properly handle SIGTERM signals, implement cleanup logic, and ensure the container exits gracefully within 20 seconds when docker stop is executed.
- Insecure Container Root User
Scenario
A Dockerfile at /home/interview/Dockerfile builds a Python web application tagged as myapp:secure. The container runs as the root user (UID 0) with extensive Linux capabilities, violating the principle of least privilege.
Task
Harden the Dockerfile by creating a non-root user appuser with UID 10001, switching to that user for application execution, ensuring application files have appropriate permissions, rebuilding the image with the tag myapp:secure, and verifying the container runs with reduced privileges while maintaining full functionality.
Example
    # Before (running as root)
    uid=0(root) gid=0(root) groups=0(root)
    # After (running as non-root user)
    uid=10001(appuser) gid=10001(appuser) groups=10001(appuser)
    Reduced capability set present
    Application responds with non-root user confirmation
    curl http://localhost:5000/health
    Response: {"status":"healthy","uid":10001,"user":"appuser"}
- Memory Limit and OOM Killer
Scenario
A container named mem_test running from image myapp:mem contains a memory stress script at /app/stress_memory.sh. Currently, the container runs with unbounded memory and can consume as much RAM as available on the host, preventing the OOM (Out of Memory) killer from terminating it even when it allocates excessive memory.
Task
Apply a 100MB memory limit to the container so that running the provided stress script causes the container to be OOM-killed. Stop the container, restart it with the memory limit applied using the appropriate flags, run the stress script, and verify the container gets OOM-killed when the memory limit is exceeded by checking the OOM kill status.
Note: The container mem_test and image myapp:mem with the stress script are already available.
Example
    # Before (no memory limits)
    $ docker inspect mem_test --format 'Memory: {{.HostConfig.Memory}}'
    Memory: 0
    $ docker inspect mem_test --format 'OOMKilled: {{.State.OOMKilled}}'
    OOMKilled: false
    # After (memory limits applied and stress script triggers OOM)
    $ docker inspect mem_test --format 'Memory: {{.HostConfig.Memory}}'
    Memory: 104857600
    $ docker inspect mem_test --format 'OOMKilled: {{.State.OOMKilled}}'
    OOMKilled: true
    $ docker inspect mem_test --format 'ExitCode: {{.State.ExitCode}}'
    ExitCode: 137
- Optimize Dockerfile
Scenario
A Dockerfile at /home/interview/Dockerfile successfully builds a Go application, but the resulting Docker image is over 800MB in size due to including the full Go toolchain, build dependencies, and source code. Production images must be under 200MB.
Task
Rewrite the Dockerfile using a multi-stage build pattern to separate the build environment from the runtime environment. Use Alpine as the base image for the final runtime stage, ensure only the compiled binary and necessary runtime dependencies are copied to the final stage, build the optimized image with the tag myapp:fixed, and verify the final image size is below 200MB while maintaining full functionality.
Example
    # Before (bloated image)
    REPOSITORY   TAG        SIZE
    myapp        original   550MB
    # After (optimized multi-stage build)
    REPOSITORY   TAG     SIZE
    myapp        fixed   45.3MB
- Storage Driver Performance Fuse Overlayfs
Scenario:
You have a container image myapp:fs built from /home/interview/Dockerfile that performs intensive filesystem write operations. Docker is currently using the overlayfs storage driver and write performance is a bottleneck.
Task:
Configure the Docker daemon (/etc/docker/daemon.json) with a configuration appropriate for a write-heavy setup. Restart the Docker daemon, and rebuild the myapp:fs image for validation testing.
Git 9 questions
- Create an Annotated Tag
Scenario:
You have a Git repository at /home/interview/repo where you've just completed version 3.1.0 of your application. You need to create an annotated tag to mark this release with a proper message and metadata.
Task:
Create an annotated tag for version v3.1.0 with a descriptive message and push it to the remote origin repository (v3.1.0).
- Fix Repository with Unrelated Histories
Scenario:
The repository at /home/interview/repo is in a broken state. Local and remote branches have diverged with no common ancestor. Consequently, git push origin main fails with a non-fast-forward error, and git pull origin main fails because the histories are unrelated.
Task:
Navigate to /home/interview/repo. Merge and linearize the unrelated histories (using rebase) to create a single commit sequence.
Example:
    # Before (Broken)
    $ git pull origin main
    fatal: refusing to merge unrelated histories
    # After (Fixed - Linear History)
    $ git pull origin main
    Already up to date.
    $ git log --oneline --decorate
    e8f9g0h (HEAD -> main, origin/main) Add local feature B
    d7e8f9g Add local feature A
    b5c6d7e Add remote config
    a4b5c6d Remote initial commit
- Merge Repositories Preserving Both Histories
Scenario:
You have two separate Git repositories at /home/interview/repo-a (5 commits) and /home/interview/repo-b (4 commits) developed independently. Create a new monorepo at /home/interview/monorepo that combines both repositories into separate subdirectories using subtree (project-a/ and project-b/) while preserving the full commit history from both repositories.
Example:
    # Before (two separate repositories)
    $ cd /home/interview/repo-a && git log --oneline | wc -l
    5
    $ cd /home/interview/repo-b && git log --oneline | wc -l
    4
    # After (combined monorepo with both histories)
    $ cd /home/interview/monorepo && ls -la
    project-a/ project-b/ .git/
    $ git log --all --oneline | wc -l
    12
- Recover Lost Commits from Detached Head
Scenario:
You have a Git repository at /home/interview/repo where you were in a detached HEAD state, made 3 commits, then switched back to the main branch. Those 3 commits are now unreachable and appear to be lost since no branch references them.
Task:
Navigate to the repository at /home/interview/repo, check the reflog to locate the lost commits from the detached HEAD state, and create a new branch called recovered-work pointing to those commits.
Example:
    # Before (commits lost, only main visible)
    $ git log --oneline -3
    c3d4e5f Main branch work
    b2c3d4e Second commit
    a1b2c3d Initial commit
    $ git branch
    * main
    # After (commits recovered in new branch)
    $ git branch
    * main
      recovered-work
    $ git log recovered-work --oneline -3
    f6a7b8c Third detached commit
    e5f6a7b Second detached commit
    d4e5f6a First detached commit
- Remove File from Entire Git History
Scenario
A file named secrets.env containing sensitive credentials exists in the repository's commit history. You need to purge this file from the entire history.
Task
Navigate to /home/interview/repo and rewrite the commit history to permanently remove secrets.env from all commits.
Example
    # Before (Sensitive data in history)
    $ git log --all --oneline -- secrets.env
    a1b2c3d Add secrets file
    # After (Clean history)
    $ git log --all --oneline -- secrets.env
    (no output)
- Restore File to Previous Version
Scenario:
You have a Git repository at /home/interview/repo where the config.js file has been modified in the last two commits, but those changes introduced bugs. You need to restore only config.js to the version it had 2 commits ago without affecting any other files.
Task:
Restore config.js to its state from 2 commits ago, then stage and commit this change with the message: Restore config.js
- Stash Work Fix Bug Restore and Update
Scenario
Uncommitted changes on feature-ui prevent you from switching branches to fix a critical bug on main.
Task
Stash local changes to clean the working directory with message WIP: UI improvements. Switch to a new branch hotfix-auth to implement the fix, then merge it into main. Finally, rebase feature-ui against the updated main and restore your stashed changes.
Changes in hotfix-auth branch are below:
    echo "Fixing critical bug" >> src/auth.js
    git commit -m "Fix critical authentication bug"
Example
    # Before (Switch blocked)
    error: Your local changes to the following files would be overwritten by checkout:
            src/ui.js
    Please commit your changes or stash them before you switch branches.
    # After (Updated and Restored)
    $ git log --oneline -1 main
    a1b2c3d Fix critical authentication bug
    $ git status
    On branch feature-ui
    Changes not staged for commit:
            modified: src/ui.js
- Stash Work Fix Bug Resume
Scenario
You have a Git repository at /home/interview/repo where you are working on a new feature on the feature-login branch with uncommitted changes in login.js. A fix is needed on the dev branch: app.js contains a JavaScript syntax error that can be identified by running node app.js.
Task
Stash your current changes to clean the working directory, switch to the dev branch, fix the syntax error in app.js, commit the fix using the commit message "Fix syntax error in app.js", then return to your feature-login branch, merge the fix from dev, and restore your stashed work.
- Update Submodule to Latest Commit
Scenario:
You have a Git repository at /home/interview/repo that contains a submodule in the vendor/utils directory. The submodule is pointing to an old commit, but newer commits exist on the submodule's remote repository.
Task:
Update the submodule to the latest commit on its default branch and commit this change in the parent repository.
Kubernetes 13 questions
- ConfigMap Reload With Sidecar
Scenario
Your application needs to react to configuration changes without requiring a pod restart.
Task
Create a ConfigMap and a pod with a sidecar that watches for configuration file changes. Mount the ConfigMap to both containers.
| Property | Value |
| --- | --- |
| Namespace | default |
| ConfigMap | app-config |
| ConfigMap key | settings.conf |
| ConfigMap value | debug=false |
| Pod | app-pod |
| Main container | app |
| Main image | nginx:1.24 |
| Sidecar container | config-watcher |
| Sidecar image | busybox |
| Mount path | /etc/config |
Sidecar container snippet is at /home/interview/watcher.yaml.
- Crashing Misconfigured Pod
Scenario
You have a Kubernetes cluster where the deployment webapp in namespace prod is stuck in CrashLoopBackOff.
The application exposes a health check endpoint at /healthz on port 8080. The container image already includes a config directory with the required files, but the deployment configuration has issues preventing the pod from starting correctly.
Because the config directory already exists and is used by the application, mounting a ConfigMap over the entire directory would overwrite it and cause the app to fail. The deployment needs to be fixed so configuration is injected without replacing the existing directory.
Task
Fix the pod so it reaches Running state with 1/1 Ready.
- CronJob Schedule Misconfiguration
Scenario
A CronJob named cleanup in the ops namespace is failing to trigger as expected. It has an incorrect schedule, relies on the default timezone (which may not match the server), and retains too many completed jobs, cluttering the history.
Task
Fix the cleanup CronJob so that validation confirms it triggers exactly once per minute. Update the schedule to * * * * *, set the timezone to Etc/UTC, and ensure only the most recent successful run is retained.
Example
Current Status (Failing):
    NAME      SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cleanup   0 0 1 1 *   False     0        <none>          5m
Target Status (Success):
    NAME      SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cleanup   * * * * *   False     0        10s             7m
- Custom Resource Definition Setup
Scenario
You need to extend the Kubernetes API to support a proprietary resource type called "Widget".
Task
Create a CustomResourceDefinition named widgets.mycompany.io. Define the group as mycompany.io, the kind as Widget, and the short name as wd, with served: true and storage: true. Specify the scope as Namespaced and define a version named v1. Create a Custom Resource instance of this type named sample-widget in the extensions namespace. Verify that the API accepts the new resource and that you can list it.
- DNS Based Service Discovery
Scenario
In namespace disco, the application discovery-app relies on DNS-based discovery of peer pods. The service discovery-svc is supposed to return all backend pod IPs, but currently:
    kubectl exec deploy/testpod -n disco -- nslookup discovery-svc.disco.svc.cluster.local
returns only one IP instead of individual pod IPs. The application cannot discover its peers and fails to form a cluster.
Task
Fix the service configuration so that DNS resolution of discovery-svc returns one IP per pod, allowing the application to discover all running instances.
| Property | Value |
| --- | --- |
| Namespace | disco |
| Deployment | discovery-app (3 replicas) |
| Service | discovery-svc |
| Pod label | app=discovery |
- Fix Job With ServiceAccount and RBAC Permission Issues
Scenario
You have a Kubernetes cluster where a job named data-loader in namespace ops fails with permission denied when calling the Kubernetes API. The job manifest file is located at /home/interview/data-loader-job.yaml.
Task
Fix the RBAC configuration and job specification so the job can successfully access the Kubernetes API to get and list pods.
Example
    # Before (job fails)
    $ kubectl get jobs -n ops
    NAME          COMPLETIONS   DURATION   AGE
    data-loader   0/1           2m         2m
    # After (job completes)
    $ kubectl get jobs -n ops
    NAME          COMPLETIONS   DURATION   AGE
    data-loader   1/1           15s        30s
- Image Pull Backoff Secrets
Scenario
A Deployment named backend in the dev namespace is failing to start. Pods are stuck in ImagePullBackOff due to configuration issues.
Task
Fix the Deployment so pods successfully pull the image and enter the Running state.
Image Information
| Property | Value |
| --- | --- |
| Registry | ghcr.io |
| Repository | prepare-sh/alpine |
| Version | 3.23.2 |
| Architecture | amd64 |
| Size | 8.44 MB |
| Username | preparesh-bot |
| Access Token | <provided-in-terminal> |
- Implement StatefulSet With Stable DNS
Scenario
You are deploying a distributed database named dns-app in the dev namespace. This application requires that each Pod be addressable by a predictable, unchanging hostname (e.g., dns-app-0.dns-app) so the nodes can find each other for data replication.
Task
Create a Headless Service named dns-app in the dev namespace. Create a StatefulSet named dns-app with 3 replicas using the image nginx:alpine. Ensure the StatefulSet uses the dns-app service for network identity. Verify that pods have stable DNS names like dns-app-0.dns-app.
Use kubectl exec -it netshoot -n dev -- nslookup dns-app-0.dns-app to validate resolution.
- Multi Tenant Namespace Isolation
Scenario
Two teams share a cluster and require strict isolation with specific exceptions for inter-team communication.
Task
Configure network isolation and resource constraints for both teams:
  - Create default deny NetworkPolicies for both namespaces (deny all ingress and egress traffic)
  - Create a NetworkPolicy allowing team-a pods to access team-b pods labeled app=api on port 8080 only
  - Create LimitRanges in both namespaces to enforce maximum resource limits per container
| Property | Value |
| --- | --- |
| Namespace 1 | team-a |
| Namespace 2 | team-b |
| Allowed communication | team-a → team-b pods with label app=api on port 8080 only |
| Default traffic | Deny all other cross-namespace traffic |
| Max CPU per container | 1 |
| Max Memory per container | 512Mi |
Note: Test pods are already deployed - client in team-a, and api + web in team-b.
- OOMKilled Pod Analysis Fix
Scenario
A memory-intensive application named oom-demo is repeatedly crashing. The pod status shows CrashLoopBackOff, but you need to confirm the underlying cause is an "Out Of Memory" (OOM) error and fix it by increasing the memory limit.
Task
Inspect the existing pod oom-demo in the apps namespace. Confirm the termination reason is OOMKilled. Update the pod definition to increase the memory limit from 20Mi to 100Mi. Verify the pod stabilizes and enters the Running state.
- Secure Internal Service Communication
Scenario
An application requires TLS certificates for internal service communication.
Task
Set up cert-manager to issue a valid TLS certificate using a SelfSigned ClusterIssuer to bootstrap a CA Issuer.
| Property | Value |
| --- | --- |
| Namespace | preparesh |
| SelfSigned ClusterIssuer | selfsigned-issuer |
| CA Certificate name | ca-cert |
| CA secret name | ca-secret |
| CA Issuer name | ca-issuer |
| CA Issuer CN | preparesh-ca |
| Certificate name | web-cert |
| Certificate secret | web-cert-tls |
| DNS names | web.preparesh.svc, web.preparesh.svc.cluster.local |
Template files are available at /home/interview/.
- StorageClass and PVC Expansion
Scenario
The cluster has no StorageClass configured for dynamic volume expansion.
Task
Create a StorageClass with volume expansion enabled, create a PVC and Pod using it, then expand the PVC from 1Gi to 2Gi.
| Property | Value |
| --- | --- |
| Namespace | storage |
| StorageClass name | fast-sc |
| PVC name | expand-pvc |
| Initial PVC size | 1Gi |
| Expanded PVC size | 2Gi |
| Pod name | storage-pod |
| Pod image | nginx |
| Mount path | /data |
Resource templates are available at /home/interview/.
- Traffic Splitting With Native Kubernetes
Scenario
Your team has a stable deployment app-v1 running in namespace canary. They want to test a new version (v2) with approximately 1/3 of traffic going to v2 and 2/3 remaining on v1.
Task
Implement canary traffic splitting between both versions using only native Kubernetes resources.
| Property | Value |
| --- | --- |
| Namespace | canary |
| Existing deployment | app-v1 |
| Canary deployment | app-v2 |
| Canary image | nginx:1.25 |
| Service name | my-app-svc |
| Service port | 80 |
A reference deployment file is available at /home/interview/deployment.yaml.
Linux 10 questions
- Analyzing Log Partition Usage
Scenario
Log rotation has stopped working correctly, and you suspect that /var/log might be mounted on a different filesystem with limited space or incorrect mount options.
Task
Determine which filesystem or device /var/log is mounted on, including device name, mount point, filesystem type, size, and usage. Save the findings to /home/devops/varlog_filesystem_info.txt. Optionally, identify whether this filesystem differs from /, which could provide additional information on the causes of log rotation or space issues.
Example
    # After (filesystem identified and analyzed)
    Filesystem   Size   Used   Avail   Use%   Mounted on
    /dev/sdb1    10G    9.8G   200M    98%    /var/log
    Mount details:
    /dev/sdb1 on /var/log type ext4 (rw,relatime)
- Debug SSH Lockout
Scenario
The developer account dev has been locked out of the server. Security logs indicate the SSH daemon's authentication failure limit was triggered.
Task
Check the logs to count exactly how many times the user's login failed today. Update the SSH configuration to increase the allowed login attempts above that number.
Example
    root@server:~# tail /var/log/auth.log
    Dec 23 09:12:01 server sshd[2201]: Failed password for dev from 10.0.0.5 port 5432 ssh2
    Dec 23 09:12:05 server sshd[2201]: Failed password for dev from 10.0.0.5 port 5432 ssh2
    Dec 23 09:12:08 server sshd[2201]: Failed password for dev from 10.0.0.5 port 5432 ssh2
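Counting the failures for one user on one day can be sketched with grep. This assumes the syslog-style line format shown in the example; the helper name count_failed_logins is hypothetical, and the limit to raise afterwards is presumably MaxAuthTries in /etc/ssh/sshd_config.

```shell
#!/bin/sh
# count_failed_logins is a hypothetical helper.
# $1 = user name, $2 = log file, $3 = date prefix exactly as it appears in the log
count_failed_logins() {
  # grep -c prints the number of matching lines; it exits non-zero on zero matches,
  # so "|| true" keeps the helper usable under "set -e".
  grep -c "^$3 .* sshd.*: Failed password for $1 from " "$2" || true
}

# Example call (not executed here); date '+%b %e' yields the syslog-style
# month and space-padded day, e.g. "Dec 23":
#   count_failed_logins dev /var/log/auth.log "$(date '+%b %e')"
```

The pattern only counts plain "Failed password for dev" lines; entries such as "Failed password for invalid user ..." would need a separate pattern if they matter.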
- Detect Memory Leak by Monitoring RSS
Scenario
One of your long-running Node.js services (process name node) has been slowing down over several hours of uptime. CPU usage is normal, disk I/O is normal, but the server is gradually running out of memory.
Task
Identify if any process is leaking memory and kill that process.
- Discover Unexpected Background Jobs
Scenario
You have noticed an unexpected spike in system load. You suspect a batch of recently spawned jobs is responsible and need to isolate processes that started within the last few minutes.
Task
Identify all processes started within the last 10 minutes and save their PID, User, Start Date (lstart), and Command to /home/devops/recent_processes.txt. Once the recent processes are written to the file, terminate the suspicious processes (i.e., any process that doesn't belong to the root user or systemd).
Example
The file /home/devops/recent_processes.txt should contain the list of recently started processes:
    PID    USER     STARTED                    CMD
    8234   deploy   Mon Oct 29 16:37:12 2025   /opt/scripts/deploy.sh
    8235   deploy   Mon Oct 29 16:37:15 2025   bash ./worker_start.sh
- Fix Inode Exhaustion Issue
Scenario:
Your server cannot create new files. Commands like touch fail with "No space left on device" errors, but df -h shows plenty of free disk space. The filesystem has exhausted available inodes.
Task:
Save inode usage to /home/interview/inode_usage.txt, find which directory contains excessive files, save the problematic directory path to /home/interview/problem_directory.txt, clean up the files, and verify the fix.
Example
File: /home/interview/inode_usage.txt
    Filesystem   Inodes    IUsed     IFree     IUse%   Mounted on
    /dev/sda1    6553600   6553600   0         100%    /
    /dev/sdb1    3276800   125000    3151800   4%      /data
File: /home/interview/problem_directory.txt
    /var/spool/postfix/maildrop
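One way to find the directory that holds the most files is to count files per parent directory. A rough sketch, assuming file names without embedded newlines; the helper name top_dirs_by_file_count is hypothetical.

```shell
#!/bin/sh
# top_dirs_by_file_count is a hypothetical helper.
# $1 = directory to scan, $2 = how many directories to list (default 5)
top_dirs_by_file_count() {
  # -xdev keeps find on one filesystem; sed strips the file name so only the
  # parent directory remains; uniq -c counts files per directory.
  find "$1" -xdev -type f 2>/dev/null \
    | sed 's|/[^/]*$||' \
    | sort | uniq -c | sort -rn | head -n "${2:-5}"
}

# Example calls (not executed here):
#   df -i > /home/interview/inode_usage.txt     # df -i reports inode usage
#   top_dirs_by_file_count / 5
```

Each output line is a file count followed by a directory path, largest first.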
- Monitoring Process Ownership
Scenario
The server is consuming excessive resources. This server is used by multiple teams with their own credentials (e.g., each team has a username such as dev-team, qa-team, or ops-team).
Task
Identify which user is running the largest number of processes (by count) on the server, regardless of CPU or memory usage, and write that username to: /home/devops/solution.txt
Example
A single username written to the expected output file.
<username>
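The per-user count is a classic sort and uniq -c pipeline. A minimal sketch, with busiest_user as a hypothetical helper that reads one username per line on standard input:

```shell
#!/bin/sh
# busiest_user is a hypothetical helper: print the name that occurs most often on stdin.
busiest_user() {
  # uniq -c prefixes each distinct name with its count; sort -rn puts the
  # largest count first; awk prints the name from that first line.
  sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }'
}

# Example call (not executed here); "user=" suppresses the ps header line:
#   ps -eo user= | busiest_user > /home/devops/solution.txt
```

Note that some ps implementations abbreviate long usernames, so the result is worth checking against the full account name.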
- Real-Time Log Timestamping
More
Scenario
You're troubleshooting a service that produces untagged log output when run manually, making it difficult to analyze timing and sequence of events.
Task
Create a command that reads from standard input line by line and appends the current timestamp to the end of each line as it's read. Test it interactively by piping output through it, then save the solution to a shell script at /usr/local/bin/timestamp.sh and make it executable so it can be used in any pipeline.
Example
# Before (untagged log output)
Application started
Processing request #1234
Database connection established
Request completed
# After (timestamped in real-time)
Application started - 2025-11-06 15:30:45
Processing request #1234 - 2025-11-06 15:30:46
Database connection established - 2025-11-06 15:30:47
Request completed - 2025-11-06 15:30:48
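The read loop described in the task could look roughly like this. The function name timestamp_lines is an assumption made for illustration; the output format follows the example above.

```shell
#!/usr/bin/env bash
# Append " - YYYY-MM-DD HH:MM:SS" to each line as soon as it is read.
timestamp_lines() {
  local line
  # IFS= and -r keep whitespace and backslashes intact; the || clause
  # still emits a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s - %s\n' "$line" "$(date '+%Y-%m-%d %H:%M:%S')"
  done
}

# In /usr/local/bin/timestamp.sh, end the script with a bare call:
#   timestamp_lines
# then make it executable:
#   chmod +x /usr/local/bin/timestamp.sh
# Example pipeline (some_service is a placeholder):
#   some_service 2>&1 | timestamp.sh
```

Calling date once per line keeps each timestamp tied to the moment the line arrived, at the cost of one process per line.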
- Update AWS Configs
Scenario
Each application environment (staging, dev, prod) has its own configuration file stored under /etc/app/envs/, and each file currently has multi_az = false and availability_zone = "us-east-1a". Manually editing each file is error-prone and inefficient, so the change must be automated.
Task
Locate all .conf files under /etc/app/envs/ across the environment subdirectories, update the multi_az setting from false to true, and change the availability_zone line to include two zones, "us-east-1a,us-east-1b". Perform these edits in place while preserving all other configuration values.
Example
# Before (single-AZ configuration)
region = "us-east-1"
availability_zone = "us-east-1a"
multi_az = false
# After (multi-AZ configuration enabled)
region = "us-east-1"
availability_zone = "us-east-1a,us-east-1b"
multi_az = true
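A sketch of the in-place edit, assuming GNU sed (whose -i takes no backup suffix argument). The function name enable_multi_az is hypothetical. Both substitutions are anchored to the key name so that other settings are left alone, and running the function twice does not change the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: switch every *.conf under ROOT to the multi-AZ values.
# Assumes GNU sed for -i without a suffix argument.
enable_multi_az() {
  local root="${1:-/etc/app/envs}"
  find "$root" -type f -name '*.conf' -exec sed -i \
    -e 's/^\([[:space:]]*multi_az[[:space:]]*=[[:space:]]*\)false/\1true/' \
    -e 's/^\([[:space:]]*availability_zone[[:space:]]*=[[:space:]]*\)"us-east-1a"/\1"us-east-1a,us-east-1b"/' \
    {} +
}

# Typical use:
#   enable_multi_az /etc/app/envs
#   grep -rE 'multi_az|availability_zone' /etc/app/envs
```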
- Upload Safe File Partitioning
Scenario
Your application uploads files from /tmp/app/, but the maximum allowed file size is 1 MB, and some files exceed this limit.
Task
Find all files larger than 1 MB in /tmp/app/ and its subdirectories, split each oversized file into 1 MB chunks in the same directory as the original file using a recognizable naming pattern (e.g., original_filename.part_aa), keep the original files intact, and verify that the chunks were created successfully.
Example
# Before (files exceed 1 MB limit)
/tmp/app/uploads/video.mp4 (3.2 MB)
/tmp/app/data/archive.tar.gz (2.5 MB)
Cannot upload due to size restrictions
# After (files split into 1 MB chunks)
/tmp/app/uploads/video.mp4
/tmp/app/uploads/video.mp4.part_aa
/tmp/app/uploads/video.mp4.part_ab
/tmp/app/uploads/video.mp4.part_ac
/tmp/app/data/archive.tar.gz
/tmp/app/data/archive.tar.gz.part_aa
/tmp/app/data/archive.tar.gz.part_ab
Chunks ready for upload within size limits
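The find-and-split step might be sketched as follows, assuming GNU find and GNU split (other implementations spell the size suffixes differently). The function name split_oversized is hypothetical. Here 1 MB is taken to mean 1 MiB (1048576 bytes), and each file is verified by concatenating its chunks and comparing them with the original.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split every file larger than 1 MiB under ROOT into
# 1 MiB chunks named <file>.part_aa, <file>.part_ab, ... next to the original.
split_oversized() {
  local root="${1:-/tmp/app}"
  # Skip existing chunks so a second run does not try to split them again.
  find "$root" -type f -size +1M ! -name '*.part_*' -exec sh -c '
    for f do
      split -b 1M "$f" "$f.part_"
      # Verify: the chunks, joined in order, must equal the original.
      cat "$f".part_* | cmp -s - "$f" || echo "chunk mismatch: $f" >&2
    done' sh {} +
}

# Typical use:
#   split_oversized /tmp/app
#   find /tmp/app -name '*.part_*' -exec ls -l {} +
```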
- Using Unmounted Partitions
Scenario
The server has unmounted partitions that are not being used and could be utilized for additional storage.
Task
Identify unmounted partitions that are safe to use (avoiding system-critical partitions like /, /boot, /boot/efi, or swap), create an ext4 filesystem on one with the label data_extra, mount it at /mnt/test, and verify it's accessible.
Example
# Before (unmounted partition unused)
Block devices scanned, unmounted partitions found
loop0p2: 20GB unmounted, no filesystem
# After (partition formatted and mounted)
Filesystem created: ext4 with label=data_extra on /dev/loop0p2
Mounted at: /mnt/test
/dev/loop0p2 on /mnt/test type ext4 (rw,relatime)
Partition ready for use
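One way to shortlist candidates is to filter the key="value" output of lsblk, as sketched below. The helper name unused_partitions is hypothetical, and treating "no filesystem and no mountpoint" as the definition of safe is an assumption; the device should still be checked by hand before formatting. The device name in the comments is taken from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsblk -Pno NAME,TYPE,FSTYPE,MOUNTPOINT` output
# on stdin and print partitions with no filesystem and no mountpoint.
# Swap is excluded automatically because its FSTYPE is "swap".
unused_partitions() {
  grep ' TYPE="part"' | grep ' FSTYPE=""' | grep ' MOUNTPOINT=""' \
    | sed -E 's/^NAME="([^"]*)".*/\1/'
}

# Typical use:
#   lsblk -Pno NAME,TYPE,FSTYPE,MOUNTPOINT | unused_partitions
# Then, as root, after double-checking the device (mkfs destroys data):
#   mkfs.ext4 -L data_extra /dev/loop0p2
#   mkdir -p /mnt/test && mount /dev/loop0p2 /mnt/test
#   findmnt /mnt/test
```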
Networking 4 questions
- Fix Port Exhaustion for High Speed Scraper
Scenario
A web-scraper systemd service is running on this system, making continuous HTTP requests. The service has started experiencing connection failures: logs show HTTP status 000 errors, indicating that connections cannot be established even though the network is functional and remote servers are accessible.
Task
Check the service status and logs to understand the failures, investigate the underlying cause, identify which system resource is exhausted, and apply the appropriate kernel configuration change.
Once the fix is applied, you can test the connection with
curl -o /dev/null -s -w "HTTP Status: %{http_code}\n" http://example.com
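The task does not name the exhausted resource, but the title points at ephemeral (local) ports. Under that assumption, a rough sketch of the investigation follows. The helper ephemeral_port_count is hypothetical, and the wider range in the comments is an illustrative value rather than one given by the exercise.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the ports in a "LOW HIGH" range, the format
# used by /proc/sys/net/ipv4/ip_local_port_range.
ephemeral_port_count() {
  awk '{ print $2 - $1 + 1 }'
}

# Inspect (read-only):
#   systemctl status web-scraper
#   journalctl -u web-scraper -n 50
#   ephemeral_port_count < /proc/sys/net/ipv4/ip_local_port_range
#   ss -tan state time-wait | wc -l   # sockets still holding a local port (plus a header line)
# Possible fix, as root (assumed range, adjust to the environment):
#   sysctl -w net.ipv4.ip_local_port_range="1024 65535"
```

If the number of sockets in use is close to the size of the range, new outbound connections have no local port left to bind, which matches the status 000 symptom.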
- Forward Traffic Between Ports
Scenario
A service on your server is running on port 8080, but you now need it to also be reachable on port 8081. The application cannot be restarted and its configuration cannot be changed.
Task
Verify the service is listening on 127.0.0.1:8080. Configure iptables to forward all TCP traffic from port 8081 to port 8080, ensuring this works for both external requests and local connections (localhost). Verify the forwarding works, then save the rules with iptables-save so they persist after a reboot.
Example
tcp 0 0 127.0.0.1:8080 0.0.0.0:* LISTEN 1234/java
PREROUTING (policy ACCEPT)
REDIRECT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:8081 redir ports 8080
OUTPUT (policy ACCEPT)
REDIRECT tcp -- 0.0.0.0/0 127.0.0.1 tcp dpt:8081 redir ports 8080
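The two rules shown in the example could be added roughly as follows. The function forward_port and the IPTABLES override (handy for a dry run with echo) are illustrative assumptions, as is the save path, which follows the Debian iptables-persistent convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper: redirect TCP traffic arriving on port FROM to port TO.
# Set IPTABLES=echo to print the rules instead of applying them.
forward_port() {
  local from="$1" to="$2" ipt="${IPTABLES:-iptables}"
  # External traffic enters through the nat PREROUTING chain.
  $ipt -t nat -A PREROUTING -p tcp --dport "$from" -j REDIRECT --to-ports "$to"
  # Locally generated traffic never sees PREROUTING, so OUTPUT needs its own rule.
  $ipt -t nat -A OUTPUT -p tcp -d 127.0.0.1 --dport "$from" -j REDIRECT --to-ports "$to"
}

# Typical use, as root:
#   ss -ltnp | grep ':8080'
#   forward_port 8081 8080
#   iptables -t nat -L -n
#   curl -s http://127.0.0.1:8081/
#   iptables-save > /etc/iptables/rules.v4   # assumed path
```

In PREROUTING, REDIRECT rewrites the destination to the address of the incoming interface, so a service bound only to 127.0.0.1 may need additional handling before external clients can reach it.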
- Inspecting HTTP Traffic Flow
Scenario
You suspect the web service isn't receiving HTTP requests, and you need to confirm network traffic to port 80.
Task
Capture network packets destined for or originating from port 80 (HTTP traffic), limit the capture to the first 10 packets to avoid large files, and save the captured packets to /tmp/http_traffic.pcap in pcap format. Then read the capture file and extract key information (source IP, destination IP with port, TCP flags), create a human-readable summary showing packet flow and TCP handshake details, and save the summary to /tmp/http_summary.txt in the format SOURCE_IP -> DEST_IP:PORT [FLAGS].
You may use tcpdump to capture and inspect packets.
Important
You can run the script below to generate http_summary.txt instead of filling in the file manually, since the main goal is to test troubleshooting.
# Note: the three-argument form of match() used below is a gawk extension.
cat <http_traffic_file> | awk '/Flags/ && /IP/ {
    # Skip IPv6 packets, only process IPv4
    if ($0 ~ /IP6/) next
    # Extract source and destination IPs WITH ports
    # Pattern: IP source.port > dest.port
    if (match($0, /IP ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\.([0-9]+) > ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\.([0-9]+)/, parts)) {
        src_ip = parts[1]
        src_port = parts[2]
        dst_ip = parts[3]
        dst_port = parts[4]
        # Extract flags
        if (match($0, /Flags (\[[^\]]+\])/, flags)) {
            print src_ip ":" src_port " -> " dst_ip ":" dst_port " " flags[1]
        }
    }
}' > /tmp/http_summary.txt
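For systems without gawk, the same summary can be sketched with POSIX awk features only. The helper name summarize_flags is hypothetical. It expects the text that tcpdump -nn prints when reading a capture, and it emits the SOURCE_IP -> DEST_IP:PORT [FLAGS] format named in the task, without the source port.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn `tcpdump -nn` text lines on stdin into
# "SRC_IP -> DST_IP:PORT [FLAGS]" using only POSIX awk features.
summarize_flags() {
  awk '{
    ip = 0; fl = 0
    for (i = 1; i <= NF; i++) {
      if ($i == "IP") ip = i        # IPv4 marker; IPv6 lines say "IP6"
      if ($i == "Flags") fl = i
    }
    if (!ip || !fl) next
    src = $(ip + 1); dst = $(ip + 3); flags = $(fl + 1)
    sub(/:$/, "", dst); sub(/,$/, "", flags)
    sub(/\.[0-9]+$/, "", src)       # drop the source port
    match(dst, /\.[0-9]+$/)         # the last dot separates IP and port
    print src " -> " substr(dst, 1, RSTART - 1) ":" substr(dst, RSTART + 1) " " flags
  }'
}

# Capture, as root (stops after 10 packets):
#   tcpdump -i any -c 10 -w /tmp/http_traffic.pcap "tcp port 80"
# Summarize:
#   tcpdump -nn -r /tmp/http_traffic.pcap | summarize_flags > /tmp/http_summary.txt
```

Locating the IP and Flags fields by name rather than by position keeps the helper working when tcpdump prefixes lines with an interface name.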
- Validating Network Routes
Scenario
Your server uses multiple network interfaces and may have incorrect routing for a specific subnet. You need to verify and fix it to ensure proper traffic flow.
Task
Display the current routing table to identify existing routes, then check whether a route for the 10.10.0.0/16 subnet exists and which interface and gateway it uses. If the route goes through eth0, delete the existing route and add a new route for 10.10.0.0/16 via gateway 192.168.100.1 using interface eth1.
Example
# Before (incorrect route through eth0)
10.10.0.0/16 routed via eth0
Gateway: 192.168.50.1
Traffic experiencing packet loss
# After (corrected route through eth1)
10.10.0.0/16 routed via eth1
Gateway: 192.168.100.1
Route verified and active
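The check-then-replace logic can be sketched with iproute2 as below. The function names route_device and fix_route are hypothetical; the subnet, gateway, and interfaces come from the task. Changing routes requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ip route show <prefix>` output on stdin and
# print the interface named after the "dev" keyword of the first route.
route_device() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Replace the 10.10.0.0/16 route only when it currently leaves via eth0.
fix_route() {
  local subnet="10.10.0.0/16"
  if [ "$(ip route show "$subnet" | route_device)" = "eth0" ]; then
    ip route del "$subnet"
    ip route add "$subnet" via 192.168.100.1 dev eth1
  fi
  ip route show "$subnet"   # print the route for verification
}

# Typical use, as root:
#   ip route show
#   fix_route
```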
Security 1 question
- Fix HTTPS Certificate Error
Scenario
A minimal HTTPS webserver script (webserver.sh) listening on port 8443 fails to establish secure connections. The bundled SSL certificate (old_server.crt) lacks a Subject Alternative Name (SAN) for the local IP 127.0.0.1, causing hostname verification failures.
Task
Run the broken server and inspect the certificate to confirm the missing SAN. Generate a new self-signed certificate with the SAN set to IP:127.0.0.1 and save it as server.crt and server.key. Update webserver.sh to use the new certificate files, launch the fixed server, and verify connectivity.
Example
# Before (Missing SAN)
subject=CN = wrong.example.com
curl: (60) SSL: no alternative certificate subject name matches target host name '1.1.1.1'
# After (SAN Present)
subject=CN = example
X509v3 Subject Alternative Name: IP Address:1.1.1.1
Hello World
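A possible way to generate the replacement certificate is sketched below. It assumes OpenSSL 1.1.1 or newer (for -addext); the function name make_san_cert, the key size, the validity period, and the CN value are illustrative choices rather than requirements of the exercise.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a self-signed server.crt and server.key into DIR
# with a Subject Alternative Name of IP:127.0.0.1.
# Assumes OpenSSL 1.1.1+ for the -addext option.
make_san_cert() {
  local dir="${1:-.}"
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout "$dir/server.key" -out "$dir/server.crt" \
    -subj "/CN=127.0.0.1" \
    -addext "subjectAltName = IP:127.0.0.1" 2>/dev/null
}

# Confirm the old certificate has no SAN (no output means none):
#   openssl x509 -in old_server.crt -noout -text | grep -A1 "Subject Alternative Name"
# Generate and check the new one:
#   make_san_cert .
#   openssl x509 -in server.crt -noout -text | grep -A1 "Subject Alternative Name"
# After pointing webserver.sh at server.crt and server.key and restarting it:
#   curl --cacert server.crt https://127.0.0.1:8443/
```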