Read a Large Text File Batch-Wise Using C#

02 February, 2022

In this article, I'll show you how to read a large text file in C# by splitting it into batches. To read the file I'll use a CustomFileReader class that implements the IEnumerable interface and returns the lines of the file batch by batch. I'll also explain, with an example, how to read a text file and convert it into a DataTable using C#.

In my previous article I explained how to read a delimited text file. Since then I have received several email requests for an article on reading large files. When you work with a large file you may get an out-of-memory exception, because the entire content of the file is read and held in memory at once.

In this article, I will show you how to overcome this issue by reading a large text file batch-wise using C#. We will split the records into multiple batches and process them one batch at a time.
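To make the memory difference concrete, here is a minimal sketch that is not part of the sample project. It contrasts File.ReadAllLines, which loads every line into an array before returning, with File.ReadLines, which streams the file lazily. The class name, the method names, and the temporary file are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;

public static class ReadingStrategies
{
    // Loads the whole file into a string array at once; risky for very large files.
    public static int CountLinesEager(string path)
    {
        string[] allLines = File.ReadAllLines(path);
        return allLines.Length;
    }

    // Streams the file lazily; lines are read from disk as the loop advances.
    public static int CountLinesLazy(string path)
    {
        int count = 0;
        foreach (string line in File.ReadLines(path))
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical temporary file used only for this demonstration.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "Id|Title", "1101|First", "1102|Second" });

        Console.WriteLine(CountLinesEager(path));
        Console.WriteLine(CountLinesLazy(path));

        File.Delete(path);
    }
}
```

Both methods return the same count; the difference is how much of the file is held in memory while counting.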

You can also read my article about Read And Write Text Files In ASP.NET using C#

Recently, in one of my projects, I had a requirement to read data from a large delimited text file, convert it into a DataTable, and insert the records into a database. On the internet I found many different ways to achieve this, but I created a simple way to read a large text file using C#. Let's create a sample application that reads a text file and converts it into a DataTable, so you can get a better idea of how it works.

In my previous articles, I explained the following topics, which you might like to read:

  1. Calculate The SUM of the DataTable Column
  2. C# | Datatable to CSV
  3. Export All The Excel Sheets to DataSet in C# and VB.NET
  4. Get Distinct Records From Datatable using LINQ C#
  5. Read CSV File In ASP.NET With Example C# and VB.NET
  6. Export Dataset/Datatable to CSV File Using C# and VB.NET

Let's start our application step by step.

Step 1: Open Visual Studio.

Step 2: Create a new project (for this demonstration I created a console application targeting .NET 5.0).

           Search and Select Console Application >> Click Next.

Create New Project

Step 3: Now, you have to configure the project.

            Provide the name of your project >> Select location for your project >> Click Next.

Configure New Project

Step 4: In the next window, select your target framework (I selected .NET 5.0).

Step 5: Once the project has been created, create a delimited text file and place it in the project folder as shown below.

Create Delimited Text File


In the created text file I wrote the following sample pipe-separated records.

Id|Type|Title|Author|Date
1101|article|Angular 12 CRUD Operation|Nikunj Satasiya|01-01-2022
1102|Blog|Google Cloud Platform For Student|Nikunj Satasiya|02-01-2022
1103|article|Best Programming Articles for Software Development|Nikunj Satasiya|08-01-2022
1104|Blog|How to Export PDF from HTML in Angular 12|Nikunj Satasiya|09-01-2022
1105|article|Angular PowerShell ng.ps1 Can Not be Loaded and Not Digitally signed|Nikunj Satasiya|10-01-2022
1106|article|Why Do Students Need Online Python Programming Help?|Nikunj Satasiya|11-01-2022
1107|Blog|Angular 12 Bar Chart Using ng2-Charts|Nikunj Satasiya|12-01-2022
1108|Blog|Rename Column Only If Exists in PostgreSQL|Nikunj Satasiya|15-01-2022
1109|article|Create REST API with ASP.NET Core 5.0 And Entity Framework Core|Nikunj Satasiya|20-01-2022

Step 6: Open the Program.cs file and import the following namespaces.

using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

Step 7: Create the CustomFileReader class.

using System;
using System.Collections.Generic;
using System.IO;
 
namespace Codingvila_ReadDelimitedFile
{
    /// <summary>
    /// Custom File reader to enumerate the lines in a batch
    /// </summary>
    public class CustomFileReader : IEnumerable<List<string>>,
        IDisposable
    {
 
        // The inner stream reader object
        StreamReader sr;
        int _batchSize = 1;
 
        /// <summary>
        /// Constructor
        /// </summary>
        /// <param name="path">File path</param>
        /// <param name="batchSize"> Size of the batch,should be greater than 0</param>
        public CustomFileReader(string path, int batchSize)
        {
            if (batchSize > 0)
            {
                _batchSize = batchSize;
            }
            else
            {
                throw new ArgumentException("Batch size should be greater than Zero", "batchSize");
            }
            sr = File.OpenText(path);
        }
 
        public void Dispose()
        {
            // close the file reader
            if (sr != null)
            {
                sr.Close();
            }
        }
 
        // IEnumerable interface
        public IEnumerator<List<string>> GetEnumerator()
        {
            string input = string.Empty;
 
            while (!sr.EndOfStream)
            {
                int i = 0;
 
                List<string> batch = new List<string>();
 
                // if not EOF, read the next line
                while (i < _batchSize && !string.IsNullOrEmpty((input = sr.ReadLine())))
                {
                    batch.Add(input);
                    i++;
                }
                if (batch.Count != 0)
                {
                    yield return batch;
                }
            }
            Dispose();
        }
 
        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}
As you can see, the CustomFileReader class implements the IEnumerable<List<string>> interface. Its enumerator reads the file and returns the lines in batches of the configured size.
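The same batching idea can also be sketched as a single iterator method, shown here only as a compact alternative and not as a replacement for the class above. The names BatchReader and ReadBatches are hypothetical, and the sketch reads from any TextReader so it can be tried with an in-memory string. Unlike CustomFileReader, it skips blank lines without ending the current batch early.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BatchReader
{
    // Validates arguments eagerly, then returns a lazy sequence of batches.
    public static IEnumerable<List<string>> ReadBatches(TextReader reader, int batchSize)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterate(reader, batchSize);
    }

    // Yields lists of up to batchSize non-empty lines.
    private static IEnumerable<List<string>> Iterate(TextReader reader, int batchSize)
    {
        var batch = new List<string>(batchSize);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue; // skip blank lines
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<string>(batchSize);
            }
        }
        if (batch.Count > 0) yield return batch; // final partial batch
    }

    public static void Main()
    {
        using var reader = new StringReader("a\nb\nc\nd\ne");
        foreach (List<string> batch in ReadBatches(reader, 2))
        {
            Console.WriteLine(string.Join(",", batch));
        }
    }
}
```

With a batch size of 2 and five input lines, the loop in Main receives two full batches followed by one batch holding the last line.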

Step 8: Now, in the Program class, create a parameterized function that returns a DataTable.

public static DataTable Read_Large_File(string FilePath, char delimiter, bool isFirstRowHeader = true)
        {
            try
            {
                DataTable objDt = new DataTable();
                CustomFileReader reader = new CustomFileReader(FilePath, 2);
 
                foreach (List<string> batch in reader)
                {
                    foreach (var Item in batch)
                    {
                        string[] arrItems = Item.Split(delimiter);
 
                        if (objDt.Columns.Count == 0)
                        {
                            if (isFirstRowHeader)
                            {
                                for (int i = 0; i < arrItems.Length; i++)
                                    objDt.Columns.Add(new DataColumn(Convert.ToString(arrItems[i]), typeof(string)));
                                continue;
                            }
                            else
                            {
                                for (int i = 0; i < arrItems.Length; i++)
                                    objDt.Columns.Add(new DataColumn("Column" + Convert.ToString(i), typeof(string)));
                            }
                        }
                        objDt.Rows.Add(arrItems);
                    }
                }
                return objDt;
            }
            catch (Exception)
            {
                throw;
            }
        }

Explanation:

The function above accepts three parameters: a string FilePath for the path of the delimited text file, a char delimiter for the delimiter used in the file, and a Boolean isFirstRowHeader that indicates whether the first row of the file contains header information.

First, I create a new DataTable object, objDt. Then I create a CustomFileReader object, passing the file path and the batch size (2 in this example), and process the data batch-wise in a foreach loop. For every line in a batch, I split the line on the delimiter and store the pieces in a string array called arrItems.

The header is handled based on the Boolean flag isFirstRowHeader. If the flag is true, the first row of the text file is treated as the header and the DataTable columns take their names from it. If the flag is false, the column names are generated as Column0, Column1, Column2, and so on, based on the number of items in the first row of the text file, and the first row itself is added as data.

Finally, every other record is added to the DataTable as a row, and the populated DataTable is returned as the result of the Read_Large_File function.
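Because Read_Large_File collects every row into a single DataTable, the table itself still grows with the file. If the records are going to a database anyway, one option is to build a small DataTable per batch and hand it off before reading further. The following is only a sketch under that assumption: BatchTableBuilder, ProcessInBatches, and the onBatch callback are hypothetical names, the callback stands in for whatever insert logic you use, and the first non-empty line is assumed to be a header row.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class BatchTableBuilder
{
    // Reads delimited lines and invokes onBatch with a DataTable of at most batchSize rows.
    // Assumes the first non-empty line is a header row.
    public static void ProcessInBatches(TextReader reader, char delimiter, int batchSize, Action<DataTable> onBatch)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        DataTable table = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue; // skip blank lines
            string[] items = line.Split(delimiter);

            if (table == null)
            {
                // Build the columns from the header row.
                table = new DataTable();
                foreach (string name in items)
                    table.Columns.Add(name, typeof(string));
                continue;
            }

            table.Rows.Add(items);
            if (table.Rows.Count == batchSize)
            {
                onBatch(table);
                table = table.Clone(); // same columns, no rows
            }
        }
        // Hand off the final partial batch, if any.
        if (table != null && table.Rows.Count > 0) onBatch(table);
    }

    public static void Main()
    {
        string data = "Id|Type\n1101|article\n1102|Blog\n1103|article";
        using var reader = new StringReader(data);
        ProcessInBatches(reader, '|', 2, t => Console.WriteLine("Batch with " + t.Rows.Count + " row(s)"));
    }
}
```

DataTable.Clone copies the column structure without the rows, so each batch starts from an empty table with the same schema.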

Step 9: Call the Read_Large_File function from the Main method.

static void Main(string[] args)
        {
            // Fetch the file path
            string path = @"F:\Codingvila\Codingvila_ReadDelimitedFile\Files\Data.txt";
 
            // Call the Read_Large_File method and pass the required parameters
            DataTable dtTable = Read_Large_File(FilePath: path, delimiter: '|', isFirstRowHeader: true);
 
            // Print Data of Datatable
            foreach (DataRow dataRow in dtTable.Rows)
            {
                foreach (var item in dataRow.ItemArray)
                {
                    Console.WriteLine(item);
                }
                Console.WriteLine("\r\n");
            }
            Console.ReadKey();
        }

Explanation:

As you can see in the code above, I call the Read_Large_File function and store its result in a DataTable.

Now you can manipulate the records in the DataTable as needed. Here, for demonstration purposes, I print the records using a foreach loop.
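As one example of such manipulation, the sketch below filters rows with DataTable.Select. It assumes the table has Type and Title columns like the sample file; the class and method names are hypothetical, and the table is built by hand so the snippet runs on its own.

```csharp
using System;
using System.Data;

public static class DataTableFilterDemo
{
    // Returns the Title values of all rows whose Type column equals the given value.
    public static string[] TitlesOfType(DataTable table, string type)
    {
        // Double any single quotes so the value is safe inside the filter expression.
        string filter = "[Type] = '" + type.Replace("'", "''") + "'";
        DataRow[] rows = table.Select(filter);
        return Array.ConvertAll(rows, r => (string)r["Title"]);
    }

    public static void Main()
    {
        // Hand-built table mirroring three columns of the sample file.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(string));
        table.Columns.Add("Type", typeof(string));
        table.Columns.Add("Title", typeof(string));
        table.Rows.Add("1101", "article", "Angular 12 CRUD Operation");
        table.Rows.Add("1102", "Blog", "Google Cloud Platform For Student");

        foreach (string title in TitlesOfType(table, "Blog"))
            Console.WriteLine(title);
    }
}
```

In the real application you would pass the DataTable returned by Read_Large_File instead of building one by hand.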

Note: I have used the statement Console.ReadKey(); at the end of the Main method. This statement waits for a key press. Without it, the console window may close as soon as the program finishes, before you can read the output.

Output

Result


Summary

In this article, we learned how to read a large text file batch-wise and how to build a DataTable from a text file in C#.

Codingvila provides articles and blogs on web and software development for beginners, as well as free academic projects for final-year students in ASP.NET, MVC, C#, VB.NET, SQL Server, AngularJS, Android, PHP, Java, Python, desktop software applications, and more.

Thank you for taking the time to read this article. If you liked it, please share it and post your comments.

If you have any questions regarding this article/blog you can contact us on info.codingvila@gmail.com