Sunday, 30 November 2025

Tokens and Context Windows in LLMs

 


Last Updated : 23 Jul, 2025

To understand how Large Language Models (LLMs) process and generate language, you need two concepts: tokens and context windows.

What are Tokens?

In the context of LLMs, a token is a basic unit of text that the model processes. A token can represent various components of language, including:

  • Words: In many cases, a token corresponds to a single word (e.g., "apple," "run," "quick").
  • Subwords: For languages with a rich morphology or for more efficient processing, words may be split into subword tokens. For example, "unhappiness" might be split into "un," "happi," and "ness."
  • Punctuation: Punctuation marks like periods, commas, and exclamation marks are also treated as individual tokens.
  • Special Tokens: Special tokens are used for specific purposes, such as indicating the beginning or end of a sentence, padding tokens, or tokens for unknown words.

Tokenization is the process of breaking down text into these smaller units. Different models use different tokenization methods.
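The token types listed above can be illustrated with a toy tokenizer. This is only a sketch: it splits on words and punctuation with a regular expression, and the special-token names (`<bos>`, `<eos>`, `<unk>`) and the `toy_tokenize` function are made up for illustration. Production LLM tokenizers learn subword vocabularies from data and behave differently.

```python
import re

# Hypothetical special tokens; real tokenizers define their own.
BOS, EOS, UNK = "<bos>", "<eos>", "<unk>"

def toy_tokenize(text, vocab=None):
    """Split text into word and punctuation tokens, wrapped in BOS/EOS."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    pieces = re.findall(r"\w+|[^\w\s]", text.lower())
    if vocab is not None:
        # Anything outside the vocabulary maps to the unknown token.
        pieces = [p if p in vocab else UNK for p in pieces]
    return [BOS, *pieces, EOS]

tokens = toy_tokenize("The quick fox runs!")
print(tokens)
```

Passing a small `vocab` set shows how out-of-vocabulary words collapse to the unknown token, which is the problem subword tokenization is meant to reduce.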

LLMs have a maximum number of tokens they can process in a single request. This limit includes both the input (prompt) and the output (generated text).

For example:

  • GPT-4 has a context window of 8,192 tokens, with some versions supporting up to 32,768 tokens.
  • Exceeding this limit requires truncating or splitting the text.
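Because the limit covers input and output together, a simple budget calculation is often useful. The sketch below assumes the prompt is already a list of token IDs; the helper names `output_budget` and `truncate_prompt` are hypothetical, and keeping the most recent tokens is only one possible truncation policy.

```python
def output_budget(context_window, prompt_tokens):
    """Tokens left for generation once the prompt is counted."""
    return max(context_window - prompt_tokens, 0)

def truncate_prompt(tokens, context_window, reserved_for_output):
    """Keep the most recent tokens so prompt + output fits the window."""
    limit = context_window - reserved_for_output
    if limit <= 0:
        raise ValueError("reserved_for_output must be smaller than context_window")
    return tokens[-limit:]

prompt = list(range(9000))  # stand-in for 9,000 token IDs
kept = truncate_prompt(prompt, context_window=8192, reserved_for_output=1000)
print(len(kept), output_budget(8192, len(kept)))
```

With an 8,192-token window and 1,000 tokens reserved for the answer, the 9,000-token prompt is cut to its last 7,192 tokens.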

What is a Context Window?

A context window is the span of text (usually measured in tokens) that a model can consider at one time when making predictions or generating text. In simpler terms, it is the "lookback": the amount of previous information that the model uses to make sense of the current input.

[Figure: Example of a context window]

LLMs, such as GPT-based models, rely heavily on context windows to predict the next token in a sequence. The larger the context window, the more information the model can access to understand the meaning of the text. However, context windows are finite, meaning that models can only consider a certain number of tokens from the input sequence before the context is truncated.

Importance of Context Windows

  • Understanding Relationships: The context window helps the model understand relationships between tokens and words. For example, the context window allows the model to capture sentence structure, grammar, and even long-range dependencies (like subject-verb agreement).
  • Text Generation: When generating text, the context window allows the model to predict the next word or token based on the input text. The model's ability to generate coherent and contextually relevant text relies on having enough context.

The size of the context window directly impacts the model’s performance. If the window is too small, the model may lose the ability to consider important context, which can affect accuracy and coherence. On the other hand, larger context windows require more computation and memory, which can increase processing time and cost.
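To get a feel for why larger windows cost more, consider standard (full) self-attention in transformer-based models, where every token is compared with every other token. The sketch below only counts those comparisons; `attention_scores` is an illustrative helper, and real systems use optimizations that change the actual cost.

```python
def attention_scores(num_tokens):
    """Number of query-key scores in one full self-attention matrix."""
    return num_tokens * num_tokens

for window in (2_048, 8_192, 128_000):
    scores = attention_scores(window)
    print(f"{window:>7} tokens -> {scores:,} scores per head per layer")
```

Growing the window by 4x (2,048 to 8,192 tokens) multiplies the number of scores by 16, which is why the growth is described as quadratic.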

Tokens and Context Window in Modern LLMs

Tokenization in LLMs

Modern LLMs typically use a form of subword tokenization (e.g., Byte Pair Encoding, WordPiece, or SentencePiece) to handle a diverse vocabulary. This method ensures that words or phrases are broken down into smaller, more manageable parts, allowing the model to handle a broader range of inputs without requiring an immense vocabulary.

For example, using subword tokenization, the word "unbelievable" might be split into the following tokens: "un," "believ" and "able".

This way, even words that the model has never seen before can be processed effectively.
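The splitting idea can be sketched with a greedy longest-match over a tiny hand-written vocabulary. This is an assumption-laden toy: `subword_tokenize` and `toy_vocab` are invented here, and real BPE, WordPiece, or SentencePiece tokenizers learn their vocabularies from data and use their own matching rules.

```python
def subword_tokenize(word, vocab):
    """Greedy longest-match split of a word into known subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No piece matched: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces

toy_vocab = {"un", "believ", "able", "happi", "ness"}
print(subword_tokenize("unbelievable", toy_vocab))
print(subword_tokenize("unhappiness", toy_vocab))
```

A word that is not covered by the vocabulary still gets tokenized, just into smaller pieces, down to single characters in the worst case.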

Context Windows in Transformer Models

Transformer-based models, such as GPT, BERT, and T5, leverage self-attention mechanisms that allow the model to focus on different parts of the input sequence. The context window in these models is defined by the maximum number of tokens that can be processed in parallel.

For example, GPT-3 has a context window of 2048 tokens, meaning it can process up to 2048 tokens at once when making predictions or generating text.

When a sequence grows longer than the window, the window effectively "slides" over it: the model considers only the most recent tokens that fit, and older tokens are dropped. This keeps the model focused on the latest part of the input, at the cost of losing whatever the dropped tokens contained.
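The sliding behaviour can be sketched with a fixed-size queue. This is a simplified illustration, not how any particular model is implemented; `sliding_context` is a hypothetical helper and the tokens are plain strings.

```python
from collections import deque

def sliding_context(token_stream, window_size):
    """Yield the visible context after each new token arrives."""
    window = deque(maxlen=window_size)  # oldest items drop off automatically
    for token in token_stream:
        window.append(token)
        yield list(window)

contexts = list(sliding_context(["a", "b", "c", "d", "e"], window_size=3))
print(contexts[-1])
```

Once the fourth token arrives, the first one is no longer visible, which is exactly the loss of older context described above.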

The following table outlines the tokenization technique and context window size of LLMs:

Model                            Tokenization Method         Context Window Size
-------------------------------  --------------------------  --------------------------------------
GPT-3                            Byte Pair Encoding (BPE)    2,048 tokens
GPT-4                            Byte Pair Encoding (BPE)    8,192 tokens (varies by configuration)
BERT                             WordPiece                   512 tokens
T5                               SentencePiece               Varies (typically 512–1024)
Llama 3.1 8B                     Byte Pair Encoding (BPE)    128,000 tokens
DeepSeek-R1-Distill-Llama-70B    Byte Pair Encoding (BPE)    128,000 tokens
Llama-3.3-70B-SpecDec            Byte Pair Encoding (BPE)    8,192 tokens

Trade-offs and Considerations

  1. Efficiency vs. Accuracy: A larger context window improves the model's ability to generate coherent text and understand complex relationships in the input. However, larger context windows require more computational resources, both in terms of memory and processing time. Balancing efficiency and accuracy is a critical consideration when designing LLMs.
  2. Memory Limitations: LLMs are constrained by the available memory. A larger context window means that the model must allocate more memory for storing tokens and their relationships. When the input exceeds the context window, earlier tokens may be discarded, potentially leading to a loss of important context.
  3. Fixed Context Windows: Some models have fixed context windows, meaning that once the window size is set during training, it cannot be changed. This limitation may affect the model's ability to handle longer text inputs, forcing truncation or the use of techniques like sliding windows.
  4. Sliding Context Windows: To address the limitations of fixed context windows, some models use a sliding window approach, where the context is updated as the model processes new tokens. This method ensures that the model always operates within the context window, but it may cause some loss of global information as tokens "fall out" of the window.
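One common way to work around a fixed window when the input is too long is to split it into overlapping chunks, so that each chunk fits the window and some context carries over between neighbours. The sketch below is one possible approach under that assumption; `chunk_tokens` and its parameters are hypothetical names.

```python
def chunk_tokens(tokens, chunk_size, overlap):
    """Split tokens into chunks of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end
    return chunks

chunks = chunk_tokens(list(range(10)), chunk_size=4, overlap=1)
print(chunks)
```

A larger overlap preserves more shared context between chunks but increases the total number of tokens processed.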


Understanding these concepts is key to optimizing LLM performance, whether you're training a new model or working with existing ones. As the field of natural language processing continues to evolve, future innovations may focus on improving how models handle tokens and context windows to create even more powerful and efficient LLMs.

Python try/finally

def test():
    try:
        return 1  # Evaluated first, but the function has not returned yet
    finally:
        return 2  # Runs last and OVERRIDES the pending return value

result = test()
print(result)  # Output: 2
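A related consequence is worth seeing: a `return` inside `finally` also discards an exception raised in the `try` block. The sketch below contrasts that with a `finally` that only does cleanup; the function names `swallow` and `cleanup_only` are made up for this example.

```python
def swallow():
    try:
        raise ValueError("lost")   # this exception never reaches the caller
    finally:
        return "from finally"      # returning here discards the pending exception

def cleanup_only(log):
    try:
        return 1
    finally:
        log.append("cleanup ran")  # no return here, so the try's value survives

log = []
print(swallow())
print(cleanup_only(log), log)
```

Keeping `finally` for cleanup only, without a `return`, avoids both surprises.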

Wednesday, 15 October 2025

SQL Server Interview Questions and Answers

1. Find duplicate records in a table (Amazon)

SELECT column1, column2, COUNT(*)
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;

 

2. Retrieve the second highest salary from the Employee table

SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee);
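To check how this query treats ties at the top, it can be run against a few made-up rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server (this particular query uses only portable SQL); the sample names and salaries are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("A", 900), ("B", 900), ("C", 700), ("D", 500)],  # made-up sample rows
)

query = """
SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee)
"""
second_highest = conn.execute(query).fetchone()[0]
print(second_highest)
```

Two employees share the top salary of 900, and the query still returns 700 because it looks for the largest value strictly below the maximum.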

 

3. Find employees without department (Uber)

SELECT e.*
FROM Employee e
LEFT JOIN Department d ON e.department_id = d.department_id
WHERE d.department_id IS NULL;

 

4. Calculate the total revenue per product (PayPal)

SELECT product_id, SUM(quantity * price) AS total_revenue FROM Sales
GROUP BY product_id;

 

5. Get the top 3 highest-paid employees (Google)

SELECT TOP 3 * FROM Employee
ORDER BY salary DESC;

 

6. Customers who made purchases but never returned products (Walmart)

SELECT DISTINCT c.customer_id
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
WHERE c.customer_id NOT IN (SELECT customer_id FROM Returns);
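One thing to watch with NOT IN: if the subquery can return a NULL, standard SQL three-valued logic makes the comparison unknown and no rows come back. The sketch below demonstrates this with Python's built-in sqlite3 module as a stand-in for SQL Server, using invented sample data, and shows NOT EXISTS as one alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER);
CREATE TABLE Orders (customer_id INTEGER);
CREATE TABLE Returns (customer_id INTEGER);
INSERT INTO Customers VALUES (1), (2);
INSERT INTO Orders VALUES (1), (2);
INSERT INTO Returns VALUES (2), (NULL);  -- one return row has no customer
""")

not_in = conn.execute("""
SELECT DISTINCT c.customer_id FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
WHERE c.customer_id NOT IN (SELECT customer_id FROM Returns)
""").fetchall()

not_exists = conn.execute("""
SELECT DISTINCT c.customer_id FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
WHERE NOT EXISTS (SELECT 1 FROM Returns r WHERE r.customer_id = c.customer_id)
""").fetchall()

print(not_in, not_exists)
```

If Returns.customer_id is declared NOT NULL, both forms give the same result; otherwise NOT EXISTS (or filtering NULLs in the subquery) is the safer choice.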

 

7. Show the count of orders per customer (Meta)

SELECT customer_id, COUNT(*) AS order_count
FROM Orders
GROUP BY customer_id;

 

8. Retrieve all employees who joined in 2023 (Amazon)

SELECT * FROM Employee
WHERE YEAR(hire_date) = 2023;

 

9. Calculate average order value per customer (Microsoft)

SELECT customer_id, AVG(total_amount) AS avg_order_value
FROM Orders
GROUP BY customer_id;

 

10. Get the latest order placed by each customer (Uber)

SELECT customer_id, MAX(order_date) AS latest_order_date FROM Orders
GROUP BY customer_id;

 

11. Find products that were never sold

SELECT p.product_id FROM Products p
LEFT JOIN Sales s ON p.product_id = s.product_id
WHERE s.product_id IS NULL;

 

12. Identify the most selling product (Adobe/Walmart)

SELECT TOP 1 product_id, SUM(quantity) AS total_qty
FROM Sales
GROUP BY product_id
ORDER BY total_qty DESC;

 

13. Get total revenue and number of orders per region (Meta)

SELECT region, SUM(total_amount) AS total_revenue, COUNT(*) AS order_count
FROM Orders
GROUP BY region;

 

14. Count customers with more than 5 orders (Amazon)

SELECT COUNT(*) AS customer_count
FROM (
 SELECT customer_id FROM Orders
 GROUP BY customer_id
 HAVING COUNT(*) > 5
) AS subquery;

 

15. Retrieve customers with orders above average order value (PayPal)

SELECT *
FROM Orders
WHERE total_amount > (SELECT AVG(total_amount) FROM Orders);

 

16. Find all employees hired on weekends (Google)

SELECT *
FROM Employee
WHERE DATENAME(WEEKDAY, hire_date) IN ('Saturday', 'Sunday');

 

17. Find all employees with salary between 50000 and 100000 (Microsoft)

SELECT *
FROM Employee
WHERE salary BETWEEN 50000 AND 100000;

 

18. Get monthly sales revenue and order count (Google)

SELECT FORMAT(date, 'yyyy-MM') AS month,
 SUM(amount) AS total_revenue,
 COUNT(order_id) AS order_count
FROM Orders
GROUP BY FORMAT(date, 'yyyy-MM');

 

19. Rank employees by salary within each department (Amazon)

SELECT employee_id, department_id, salary,
 RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rk
FROM Employee;

 

20. Find customers who placed orders every month in 2023 (Meta)

SELECT customer_id
FROM Orders
WHERE YEAR(order_date) = 2023
GROUP BY customer_id
HAVING COUNT(DISTINCT FORMAT(order_date, 'yyyy-MM')) = 12;

 

21. Find moving average of sales over the last 3 days (Microsoft)

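-- The frame covers the current row and the two rows before it,
-- so this equals a 3-day average only if there is one row per day.
SELECT order_date,
 AVG(total_amount) OVER (ORDER BY order_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM Orders;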

 

22. Identify the first and last order date for each customer (Uber)

SELECT customer_id, MIN(order_date) AS first_order, MAX(order_date) AS last_order
FROM Orders
GROUP BY customer_id;

 

23. Show product sales distribution (percent of total revenue) (PayPal)

WITH TotalRevenue AS (
 SELECT SUM(quantity * price) AS total FROM Sales
)
SELECT s.product_id,
 SUM(s.quantity * s.price) AS revenue,
 SUM(s.quantity * s.price) * 100 / t.total AS revenue_pct
FROM Sales s
CROSS JOIN TotalRevenue t
GROUP BY s.product_id, t.total;

 

24. Retrieve customers who made consecutive purchases (2 Days) (Walmart)

WITH cte AS (
 SELECT customer_id, order_date,
 LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date
 FROM Orders
)
SELECT customer_id, order_date, prev_order_date
FROM cte
WHERE DATEDIFF(DAY, prev_order_date, order_date) = 1;

 

25. Find churned customers (no orders in the last 6 months) (Amazon)

SELECT customer_id
FROM Orders
GROUP BY customer_id
HAVING MAX(order_date) < DATEADD(MONTH, -6, GETDATE());

 

26. Calculate cumulative revenue by day (Adobe)

SELECT order_date,
 SUM(total_amount) OVER (ORDER BY order_date) AS cumulative_revenue
FROM Orders;

 

27. Identify top-performing departments by average salary (Google)

SELECT department_id, AVG(salary) AS avg_salary
FROM Employee
GROUP BY department_id
ORDER BY avg_salary DESC;

 

28. Find customers who ordered more than the average number of orders per customer (Meta)

WITH customer_orders AS (
 SELECT customer_id, COUNT(*) AS order_count
 FROM Orders
 GROUP BY customer_id
)
SELECT * FROM customer_orders
WHERE order_count > (SELECT AVG(order_count) FROM customer_orders);

 

29. Calculate revenue generated from new customers (first-time orders) (Microsoft)

WITH first_orders AS (
 SELECT customer_id, MIN(order_date) AS first_order_date
 FROM Orders
 GROUP BY customer_id
)
SELECT SUM(o.total_amount) AS new_revenue
FROM Orders o
JOIN first_orders f ON o.customer_id = f.customer_id
WHERE o.order_date = f.first_order_date;

 

30. Find the percentage of employees in each department (Uber)

SELECT department_id, COUNT(*) AS emp_count,
 COUNT(*) * 100.0 / (SELECT COUNT(*) FROM Employee) AS pct
FROM Employee
GROUP BY department_id;

 

31. Retrieve the maximum salary difference within each department (PayPal)

SELECT department_id,
 MAX(salary) - MIN(salary) AS salary_diff
FROM Employee
GROUP BY department_id;

 

32. Find products that contribute to 80% of the revenue (Pareto Principle) (Walmart)

WITH sales_cte AS (
 SELECT product_id, SUM(quantity * price) AS revenue
 FROM Sales
 GROUP BY product_id
),
total_revenue AS (
 SELECT SUM(revenue) AS total FROM sales_cte
),
running AS (
 -- A window function cannot appear in WHERE, so the running total is computed here.
 SELECT s.product_id, s.revenue,
  SUM(s.revenue) OVER (ORDER BY s.revenue DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_revenue
 FROM sales_cte s
)
-- The original query was cut off; the 0.8 threshold is taken from the question title.
SELECT r.product_id, r.revenue, r.running_revenue
FROM running r, total_revenue t
WHERE r.running_revenue <= 0.8 * t.total;

 

33. Calculate average time between two purchases for each customer (Meta)

WITH cte AS (
 SELECT customer_id, order_date,
 LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_date
 FROM Orders
)
SELECT customer_id,
 AVG(DATEDIFF(DAY, prev_date, order_date)) AS avg_gap_days
FROM cte
WHERE prev_date IS NOT NULL
GROUP BY customer_id;

 

34. Show last purchase for each customer along with order amount (Google)

WITH ranked_orders AS (
 SELECT customer_id, order_id, total_amount,
 ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
 FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM ranked_orders
WHERE rn = 1;

 

35. Calculate year-over-year growth in revenue (Microsoft)

SELECT FORMAT(order_date, 'yyyy') AS year,
 SUM(total_amount) AS revenue,
 SUM(total_amount) - LAG(SUM(total_amount)) OVER (ORDER BY FORMAT(order_date, 'yyyy')) AS yoy_growth
FROM Orders
GROUP BY FORMAT(order_date, 'yyyy');

 

36. Detect customers whose purchase amount is higher than their historical 90th percentile (Amazon)

WITH ranked_orders AS (
 SELECT customer_id, order_id, total_amount,
 NTILE(10) OVER (PARTITION BY customer_id ORDER BY total_amount) AS decile
 FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM ranked_orders
WHERE decile = 10;

 

37. Retrieve the longest gap between orders for each customer (Meta)

WITH cte AS (
 SELECT customer_id, order_date,
 LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date
 FROM Orders
)
SELECT customer_id,
 MAX(DATEDIFF(DAY, prev_order_date, order_date)) AS max_gap
FROM cte
WHERE prev_order_date IS NOT NULL
GROUP BY customer_id;

 

38. Identify customers with revenue below the 10th percentile (Google)

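WITH cte AS (
 SELECT customer_id, SUM(total_amount) AS total_revenue
 FROM Orders
 GROUP BY customer_id
)
SELECT customer_id, total_revenue
FROM cte
WHERE total_revenue < (
 -- In SQL Server, PERCENTILE_CONT is a window function and requires OVER ();
 -- DISTINCT collapses the repeated value to a single row.
 SELECT DISTINCT PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY total_revenue) OVER () FROM cte
);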

 


Friday, 1 August 2025

R1RCM Interview for RPA AA360

 Interviewer :  Abhishek and Panday & Baijnath HR


Q1- Introduce yourself and your professional experience

Q2- What is a controller in MVC?

Q3- What is ViewBag?

Q4- What's your current project?

Q5- What's your role?

Q6- Automation life cycle

Q7- Automation best practices

Q8- AARI framework

Q9- What is a trigger?

Q10- How will you handle an event that occurs on a folder, file, or other object?

Q11- Exception handling
Q12- Credential manager
Q13- What are the alternatives to CyberArk for credential management?

Q14- CyberArk architecture behind the scenes
Q15- How can you manage credentials in Automation Anywhere?
Q16- What is encryption?
Q17- Types of encryption
Q18- Names of different encryption algorithms and how to use them
Q19- What challenges do you face with RPA?
Q20- Types of exceptions
Q21- Who is responsible for logical errors?
Q22- What's the best way to find an element on a web page?
Q23- If an element's position on the page changes in the future, how will you fix it, and what's the best way?
Q24- How will you design an automation framework?
Q25- Have you installed AA on your machine now?
Q26- Which language are you comfortable with?
Q27- Write C# code to check for a prime number in a compiler

Q28- We are done; do you want to ask anything?
Q29- You asked mostly .NET questions, so will we be working on .NET or automation?
Q30- Are we using advanced automation such as API, GenAI, and cloud?
Q31- Are we a product-based company or a service-based one?
Q32- Is it support work or development work? You asked mostly support questions.


Here’s a complete and professional set of interview answers tailored for your profile (13+ years in IT, 9+ years in RPA with AA, .NET knowledge, working in BFSI and healthcare domains):


Q1. Introduce yourself and your professional experience.

Hi, my name is Rakesh Kumar. I have 13+ years of overall IT experience, including over 9 years in Robotic Process Automation. I’ve worked extensively with Automation Anywhere (both v11 and A360), and I’m certified at both the Advanced and Master levels. My core expertise lies in architecting automation solutions, developing bots, and integrating with APIs and external systems. I’ve worked in BFSI, healthcare, and life sciences domains and had overseas experience in Malaysia and Uruguay. I’ve led multiple end-to-end automation implementations and currently work as a Senior RPA Specialist.


Q2. What is a Controller in MVC?

In MVC, a Controller handles incoming user requests, processes the data via business logic or model, and returns the appropriate view. It's the main decision-maker that connects the Model and the View.


Q3. What is ViewBag?

ViewBag is a dynamic property in ASP.NET MVC used to pass data from Controller to View during runtime without using a strongly typed model.


Q4. What's your current project?

I’m currently working on automating financial reconciliation and document processing workflows using Automation Anywhere A360, integrating it with APIs, Excel automation, and SQL Server. It includes audit trail, logging, exception handling, and secure credential management.


Q5. What’s your role?

I’m playing the role of Solution Architect and lead developer. I’m responsible for requirement analysis, bot design, review, deployment, performance optimization, client demos, and mentoring junior developers.


Q6. Automation Life Cycle?

  1. Opportunity Identification

  2. Process Assessment

  3. Solution Design

  4. Development

  5. Testing (UAT & SIT)

  6. Deployment

  7. Monitoring & Support

  8. Maintenance & Enhancements


Q7. Automation Best Practices

  • Use config files for dynamic values

  • Implement proper exception handling

  • Create reusable sub-bots

  • Maintain proper logging and audit trail

  • Use credential vaults for security

  • Follow naming conventions and documentation


Q8. What is AARI Framework?

AARI (Automation Anywhere Robotic Interface) allows users to trigger bots via a web-based or desktop interface for attended automation. It helps in human-in-the-loop scenarios.


Q9. What is a Trigger?

A trigger is an event that initiates the bot, such as:

  • File creation/modification

  • Email arrival

  • Schedule-based

  • Database event


Q10. How to handle file/folder events?

Use the Trigger Package in A360, set it for a folder path, and define which event (create/delete/modify) should trigger the bot. Use error handling and logs for robustness.


Q11. Exception Handling

I use Try-Catch blocks and ErrorHandler sub-bots to log error details like:

  • Form/Function Name

  • Line Number

  • Screenshot

  • Error Message

I also send email alerts using the Email package.


Q12. Credential Manager

In AA, the Credential Vault securely stores credentials. You can create lockers, assign access, and fetch credentials using the Credential package.


Q13. Alternatives to CyberArk

  • Azure Key Vault

  • AWS Secrets Manager

  • HashiCorp Vault

  • AA Credential Vault

  • KeePass (in small orgs)


Q14. CyberArk Architecture

CyberArk stores credentials in a secure Vault Server. Bots authenticate via a connector or plugin, retrieve credentials using policies, and never expose them in plain text. It uses secure APIs and logs all access.


Q15. How to manage credentials in AA?

Use Credential Vault:

  • Create a locker

  • Add credentials

  • Assign access

  • Use the “Get Credential” action to fetch values during bot execution securely


Q16. What is Encryption?
Encryption is the process of converting readable data into an unreadable format to protect it from unauthorized access.


Q17. Types of Encryption

  • Symmetric Encryption (Same key)

  • Asymmetric Encryption (Public/Private key pair)


Q18. Encryption Algorithms

  • AES (Advanced Encryption Standard)

  • RSA (Rivest–Shamir–Adleman)

  • DES (Data Encryption Standard)

  • Blowfish

Example in .NET (AES, RSA, and DES implementations live in this namespace):

using System.Security.Cryptography;


Q19. Challenges faced in RPA

  • Unstable UI selectors

  • Credential management

  • Handling unstructured data (PDFs/emails)

  • Delay in backend response

  • Process changes post-deployment


Q20. Types of Exceptions

  • Application Exception

  • Business Rule Exception

  • System Exception

  • Timeout Exception


Q21. Who is responsible for Logical Errors?

The developer is responsible for logical errors due to incorrect flow or calculations.


Q22. Best way to find a web element?

  • Use stable attributes like id, name, class

  • Prefer XPath or CSS selectors

  • Use AA's Object Cloning or Universal Recorder with fallback logic


Q23. How to handle element position change?

Use dynamic locators, wildcards, or anchor-based methods. Maintain a central selector repository or config and use retry with timeout logic.
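The "retry with timeout" idea can be sketched in a few lines. This is a language-neutral illustration written in Python, not Automation Anywhere code; `retry_until` and `find_element` are hypothetical names, and the locator here is a stub that succeeds on the third attempt.

```python
import time

def retry_until(action, timeout_s=5.0, interval_s=0.5):
    """Call `action` until it returns a non-None value or time runs out."""
    deadline = time.monotonic() + timeout_s
    last_error = None
    while True:
        try:
            result = action()
            if result is not None:
                return result
        except Exception as exc:  # e.g. "element not found"
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError("action did not succeed in time") from last_error
        time.sleep(interval_s)

# Hypothetical locator that succeeds on the third attempt.
attempts = {"count": 0}

def find_element():
    attempts["count"] += 1
    return "submit-button" if attempts["count"] >= 3 else None

element = retry_until(find_element, timeout_s=2.0, interval_s=0.01)
print(element, attempts["count"])
```

In a real bot, the action would try the primary selector and then the fallbacks from the central selector config before giving up.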


Q24. How do you design an automation framework?

I use modular architecture:

  • Main bot calls reusable sub-bots

  • Config-driven design

  • Separate folders for Logs, Screenshots, Inputs

  • ErrorHandler sub-bot

  • Credential locker access

  • Audit Trail DB logging


Q25. Have you installed AA on your machine?

Yes, I have Automation Anywhere A360 Community Edition and enterprise experience. I use it for PoCs and testing.


Q26. Which language are you comfortable with?

Primarily C#, VB.NET, Python, and scripting in Automation Anywhere.


Q27. C# Code to Check Prime Number

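using System;

class Program
{
    static void Main()
    {
        int number = 29;
        // Numbers below 2 are not prime
        bool isPrime = number > 1;
        // Checking divisors up to the square root is sufficient
        for (int i = 2; i * i <= number; i++)
        {
            if (number % i == 0) { isPrime = false; break; }
        }
        Console.WriteLine(isPrime ? "Prime" : "Not Prime");
    }
}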

Q28. Do you have any questions for us?

Yes, I’d like to know:

  • What is the team structure?

  • Are there any plans to integrate GenAI or advanced analytics into your automation roadmap?

  • What would success look like in this role within the first 3 months?


Q29. Are we working on .NET or Automation?

From the discussion, I sense both technologies are important. Could you clarify the percentage of work expected on .NET and Automation?


Q30. Are we using advanced Automation like API, GenAI, Cloud?

That’s great if the organization is exploring these. I have experience in REST APIs and am exploring GenAI integrations for document understanding. Would love to contribute.


Q31. Is it a Product or Service-based company?

Just to align better, could you please confirm if it's a product-based company or a service provider?


Q32. Is it Support or Development work?

That’s a helpful distinction. Since many questions were support-oriented, I’d love to confirm if the role is focused more on support or on development and design.