Spring Batch Example CSV to MySQL: JPA Batch Insert java Spring Spring Batch Spring Boot by devs5003 - February 4, 2024February 15, 20242 Last Updated on February 15th, 2024Sometimes in real time projects, we need to transfer data from one location to another. If the volume of data is small, we can achieve this by applying any traditional approach. On the other hand, if there is a huge amount of data, we can make use of the Spring Batch API to make the transfer of data faster and performant. In this article ‘Spring Batch Example CSV to MySQL Using JPA’, we are going to transfer the data from CSV file to MySQL database using Spring Boot Batch. We have already discussed theoretical concepts of Spring Batch in a separate article ‘Spring Batch Tutorial‘. Now, its time to implement our example ‘Spring Batch Example CSV to MySQL Using JPA’. Table of Contents Toggle What will you be expecting from this Article?Why is this example so important?Why to use Spring Batch in this Example?Software/Technologies Used in the ExampleSpring Batch Example CSV to MySQL Using JPADetails of the Use case Step #1: Create a Spring Boot ProjectStep #2: Place your CSV fileStep #3: Modify application.properties fileStep #4: Create an entity class as Invoice.javaStep #5: Create an interface for repository as InvoiceRepository.javaStep #6: Create an optional listener class as InvoiceListener.javaStep #7: Create a configuration class as BatchConfig.javaStep #7A: Create a helper class as BlankLineRecordSeparatorPolicy.javaStep #8: Create a Job Launcher as JobRunner.javaHow to Test the Spring Batch implemented Application?Console OutputCommon Exceptions/Errors During developmentParsing Exception Cause of ExceptionSolutionBad SQL Grammer Exception Cause of ExceptionSolution What will you be expecting from this Article? After going through all the details in this article, you will be able to answer the following: 1) Why is this example so important to implement? 2) What are the technologies/software used to develop this example? 3) How to read records from a CSV file using FlatFileItemReader? 4) How to transform/process records using ItemProcessor? 5) How to write records into database using ItemWriter and Spring Data JPA? 6) How to use JobListener interface provided by Spring Batch? 7) How to launch a Spring Batch job using Runner? 8) How to test and verify that the records are inserted into the database successfully? 9) How to troubleshoot problems faced during the development of the example? 10) How to minimize the number of lines of code using Lambda Expressions? 11) Last but not the least, How to develop ‘Spring Batch Example CSV to MySQL Using Spring Boot & JPA’? Why is this example so important? In real time projects it is very common that the client provides you some data to insert in the application. Most of the times you will be provided with a CSV file filled with the huge amount of data. Now, if you have to insert data in the application, it means you need to insert the same in the database. Even sometimes in some applications it is a continuous real time process to insert data from CSV file to the database, may be using a scheduler. Why to use Spring Batch in this Example? Spring Batch supports scalable and parallel processing of data, which is important when processing large amount of data. This ensures improved performance and powerful processing, particularly when migrating large amounts of data from a CSV file to a MySQL database. Spring Batch offers built-in support for reading and writing data in chunks, which is beneficial for processing large datasets efficiently. It offers item readers and writers that can be configured to read data from a CSV file and write it to a MySQL database. Spring Batch is particularly designed for batch processing that makes it a normal choice for handling large volumes of data. Software/Technologies Used in the Example Sometimes some version conflicts with other version. Hence, listing down the combinations that are proven to be working with each other. Below is the proven combination of software that are used to develop these examples. 1) CSV (Microsoft® Excel® 2019) 2) Spring Boot 2.6.0 3) JDK 1.8 or later 4) Lombok 1.18.22 5) Spring Data JPA 2.6.0 6) spring-boot-starter-batch 2.6.0 7) Spring Framework 5.3.13 8) Maven 3.8.1 9) IDE – STS 4.7.1.RELEASE (Starters: ‘Spring Batch’, ‘MySQL Driver’, ‘Spring Data JPA’, Lombok) Spring Batch Example CSV to MySQL Using JPA In order to implement the Spring Boot Batch Example CSV to MySQL Using JPA using Spring Batch concept, follow the step by step tutorial given as below. Details of the Use case Let’s assume that we have a CSV file filled with huge volume of data. On the other hand, we have a MySQL database. We need to read the data from CSV, perform some data transformation, and then transfer the data to MySQL. Furthermore, for the database communication we will use Spring Data JPA. Our CSV file will contain data about Invoices such as invoice name, invoice number, invoice amount, discount, location etc. In order to implement data transformation, we will calculate the final amount on the basis of discount applied to the amount. Let’s start implementing our use case Spring Boot Batch Example CSV to MySQL Using JPA step by step. Step #1: Create a Spring Boot Project Here, we will use STS(Spring Tool Suite) to create our Spring Boot Project. If you are new to Spring Boot, visit the internal link to create a sample project in spring boot. While creating a project in STS, add starters ‘Spring Batch’, ‘MySQL Driver’, ‘Spring Data JPA’, and ‘Lombok’. However, we will use Lombok in order to generate boilerplate code automatically such as getters, setters, constructors, toString() etc. If you are not using Lombok, you can even write the required code manually. Step #2: Place your CSV file Generally, there are three places where we can place our CSV file as shown below. We call this location as Resource Location. 1) In Project Class Path: src/main/resources (ClasspathResource) 2) In local directory: D://mydata (FileSystemResource) 3) In Remote location: https://xyz.com/files/… (UrlResource) In our case, we will use the first location ‘Project Class Path’ which is under ‘src/main/resources’ folder in our STS. If you already have a CSV file, you can place it here. Otherwise, create a file with extension .csv at the same location and enter some data into it. It’s “invoices.csv” for our case. Below is the sample of “invoices.csv”. Step #3: Modify application.properties file Add below properties in your application.properties file. #Server server.port=8080 #Database Connection spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver spring.datasource.url=jdbc:mysql://localhost:3306/batchdb spring.datasource.username=root spring.datasource.password=devs #----------ORM Details------------------- #To display SQL At console spring.jpa.show-sql=true #To Create tables spring.jpa.hibernate.ddl-auto=create #To Generate SQL queries spring.jpa.database-platform=org.hibernate.dialect.MySQL8Dialect #----------Spring Batch Properties---------- # By default it's true which means all the Spring batches will start executing automatically spring.batch.job.enabled=false # Tables for metadata created by Spring Boot (Always, Embedded, Never) spring.batch.jdbc.initialize-schema=ALWAYS As shown above, we are using Spring Data JPA Concept to save data into MySQL. All properties are already commented. Step #4: Create an entity class as Invoice.java As aforementioned, we will have invoice related data in our CSV, we need to create a model/entity class invoice.java accordingly as shown below. Here, we have used Lombok to generate getters, setters, constructors etc. import javax.persistence.Entity; import javax.persistence.GeneratedValue; import javax.persistence.Id; import lombok.AllArgsConstructor; import lombok.Data; import lombok.NoArgsConstructor; @Entity @Data @NoArgsConstructor @AllArgsConstructor public class Invoice { @Id @GeneratedValue private Long id; private String name; private String number; private Double amount; private Double discount; private Double finalAmount; private String location; } Step #5: Create an interface for repository as InvoiceRepository.java In order to save data into database, create a repository interface InvoiceRepository.java which will extend JpaRepository interface as shown below. import org.springframework.data.jpa.repository.JpaRepository; Import com.dev.springboot.entity.Invoice; public interface InvoiceRepository extends JpaRepository<Invoice, Long>{ } Step #6: Create an optional listener class as InvoiceListener.java Let’s create a class InvoiceListenet.java which will implement JobExecutionListener. It has two methods beforeJob() and afterJob() overwritten from the interface JobExecutionListener as shown below. This is an optional class to create, if you want to execute some logic before and after the job execution. import org.springframework.batch.core.JobExecution; import org.springframework.batch.core.JobExecutionListener; public class InvoiceListener implements JobExecutionListener{ @Override public void beforeJob(JobExecution jobExecution) { System.out.println("Job started at: "+ jobExecution.getStartTime()); System.out.println("Status of the Job: "+jobExecution.getStatus()); } @Override public void afterJob(JobExecution jobExecution) { System.out.println("Job Ended at: "+ jobExecution.getEndTime()); System.out.println("Status of the Job: "+jobExecution.getStatus()); } } Step #7: Create a configuration class as BatchConfig.java This is a configuration class where we create multiple objects that are required to perform batch processing. This is an important class for our Spring Boot Batch Example CSV to MySQL Using JPA. These objects are listed below: 1) ItemReader 2) ItemWriter 3) ItemProcessor 4) JobListener (Optional) 5) Step 6) Job As shown above the JobListener is an optional object. All other 5 objects are necessary to implement the Spring batch concept. Furthermore, we have used 3 autowiring in order to support the creation of objects: InvoiceRepository to create ItemWriter, StepBuilderFactory to create Step, and JobBuilderFactory to create Job respectively. Note: Don’t forget to apply 2 annotations in this class: @Configuration and @EnableBatchProcessing import org.springframework.batch.core.Job; import org.springframework.batch.core.JobExecutionListener; import org.springframework.batch.core.Step; import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing; import org.springframework.batch.core.configuration.annotation.JobBuilderFactory; import org.springframework.batch.core.configuration.annotation.StepBuilderFactory; import org.springframework.batch.core.launch.support.RunIdIncrementer; import org.springframework.batch.item.ItemProcessor; import org.springframework.batch.item.ItemWriter; import org.springframework.batch.item.file.FlatFileItemReader; import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper; import org.springframework.batch.item.file.mapping.DefaultLineMapper; import org.springframework.batch.item.file.transform.DelimitedLineTokenizer; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.core.io.ClassPathResource; import com.dev.springboot.entity.Invoice; import com.dev.springboot.listener.InvoiceListener; import com.dev.springboot.repo.InvoiceRepository; @Configuration @EnableBatchProcessing public class BatchConfig { //Reader class Object @Bean public FlatFileItemReader<Invoice> reader() { FlatFileItemReader<Invoice> reader= new FlatFileItemReader<>(); reader.setResource(new ClassPathResource("/invoices.csv")); // Reader. setResource(new FileSystemResource("D:/mydata/invoices.csv")); // reader.setResource(new UrlResource("https://xyz.com/files/invoices.csv")); // reader.setLinesToSkip(1); reader.setLineMapper(new DefaultLineMapper<>() {{ setLineTokenizer(new DelimitedLineTokenizer() {{ setDelimiter(DELIMITER_COMMA); setNames("name","number","amount","discount","location"); }}); setFieldSetMapper(new BeanWrapperFieldSetMapper<>() {{ setTargetType(Invoice.class); }}); }}); reader.setRecordSeparatorPolicy(new BlankLineRecordSeparatorPolicy()); return reader; } //Autowire InvoiceRepository @Autowired InvoiceRepository repository; //Writer class Object @Bean public ItemWriter<Invoice> writer(){ // return new InvoiceItemWriter(); // Using lambda expression code instead of a separate implementation return invoices -> { System.out.println("Saving Invoice Records: " +invoices); repository.saveAll(invoices); }; } //Processor class Object @Bean public ItemProcessor<Invoice, Invoice> processor(){ // return new InvoiceProcessor(); // Using lambda expression code instead of a separate implementation return invoice -> { Double discount = invoice.getAmount()*(invoice.getDiscount()/100.0); Double finalAmount= invoice.getAmount()-discount; invoice.setFinalAmount(finalAmount); return invoice; }; } //Listener class Object @Bean public JobExecutionListener listener() { return new InvoiceListener(); } //Autowire StepBuilderFactory @Autowired private StepBuilderFactory sbf; //Step Object @Bean public Step stepA() { return sbf.get("stepA") .<Invoice,Invoice>chunk(2) .reader(reader()) .processor(processor()) .writer(writer()) .build() ; } //Autowire JobBuilderFactory @Autowired private JobBuilderFactory jbf; //Job Object @Bean public Job jobA(){ return jbf.get("jobA") .incrementer(new RunIdIncrementer()) .listener(listener()) .start(stepA()) // .next(stepB()) // .next(stepC()) .build() ; } } In the below step, we have created a helper class BlankLineRecordSeparatorPolicy,java in order to handle blank lines in the CSV while reading the items by the FlatFileItemReader. Step #7A: Create a helper class as BlankLineRecordSeparatorPolicy.java import org.springframework.batch.item.file.separator.SimpleRecordSeparatorPolicy; public class BlankLineRecordSeparatorPolicy extends SimpleRecordSeparatorPolicy { @Override public boolean isEndOfRecord(final String line) { return line.trim().length() != 0 && super.isEndOfRecord(line); } @Override public String postProcess(final String record) { if (record == null || record.trim().length() == 0) { return null; } return super.postProcess(record); } } Step #8: Create a Job Launcher as JobRunner.java import org.springframework.batch.core.Job; import org.springframework.batch.core.JobParameters; import org.springframework.batch.core.JobParametersBuilder; import org.springframework.batch.core.launch.JobLauncher; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.boot.CommandLineRunner; import org.springframework.stereotype.Component; @Component public class JobRunner implements CommandLineRunner { @Autowired private JobLauncher jobLauncher; @Autowired private Job jobA; @Override public void run(String... args) throws Exception { JobParameters jobParameters = new JobParametersBuilder() .addLong("time", System.currentTimeMillis()) .toJobParameters(); jobLauncher.run(jobA, jobParameters); System.out.println("JOB Execution completed!"); } } Here, we have created JobRunner class in order to launch the job. We can also use the Scheduler while doing real time implementation. Runner is used to execute a Spring Boot Application only once. How to Test the Spring Batch implemented Application? In order to test the Spring Boot Batch Example CSV to MySQL Using JPA, follow the steps as given below: 1) Start your Spring Boot Application and make sure that it is getting started without any exception/error. 2) You should see all the values of print statements in the console such as ‘Job started at’, ‘Status of the Job’, ‘Saving Invoice Records’, ‘Job Ended at’, ‘Status of the Job’ etc. Make sure that the value of “Status of the Job” should be ‘STARTED’ for the first time and ‘COMPLETED’ for the second time. 3) Check the records in the MySQL database whether they are uploaded properly or not. Console Output The console output will be something like below. We have 7 records in the SCV file. Since we have declared chunk size 2, records are being written to the database in 4 batches. First 3 batches are having 2 records which is equal to the chunk size. The 4th and last batch has the remaining only 1 record. Job started at: Fri Jul 01 21:01:55 IST 2022 Status of the Job: STARTED [2m2022-07-01 21:01:55.799[0;39m [32m INFO[0;39m [35m10336[0;39m [2m---[0;39m [2m[ main][0;39m [36mo.s.batch.core.job.SimpleStepHandler [0;39m [2m:[0;39m Executing step: [stepA] Saving Invoice Records: [Invoice(id=null, name=INV001, number=CSQL49SR, amount=496.75, discount=25.0, finalAmount=372.5625, location=Chicago), Invoice(id=null, name=INV002, number=QRQL96MT, amount=498.25, discount=15.0, finalAmount=423.5125, location=Houston)] Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Saving Invoice Records: [Invoice(id=null, name=INV003, number=QLMP84CS, amount=459.0, discount=22.0, finalAmount=358.02, location=Dallas), Invoice(id=null, name=INV004, number=SPMN49DZ, amount=475.5, discount=12.0, finalAmount=418.44, location=New York)] Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Saving Invoice Records: [Invoice(id=null, name=INV005, number=QLMP84CS, amount=495.0, discount=18.0, finalAmount=405.9, location=Paris), Invoice(id=null, name=INV006, number=SPMN49DZ, amount=434.5, discount=12.0, finalAmount=382.36, location=Germany)] Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) Saving Invoice Records: [Invoice(id=null, name=INV007, number=QLMP84CS, amount=439.24, discount=13.0, finalAmount=382.1388, location=United Kingdom)] Hibernate: select next_val as id_val from hibernate_sequence for update Hibernate: update hibernate_sequence set next_val= ? where next_val=? Hibernate: insert into invoice (amount, discount, final_amount, location, name, number, id) values (?, ?, ?, ?, ?, ?, ?) [2m2022-07-01 21:01:56.003[0;39m [32m INFO[0;39m [35m10336[0;39m [2m---[0;39m [2m[ main][0;39m [36mo.s.batch.core.step.AbstractStep [0;39m [2m:[0;39m Step: [stepA] executed in 204ms Job Ended at: Fri Jul 01 21:01:56 IST 2022 Status of the Job: COMPLETED Common Exceptions/Errors During development You may face various exceptions while developing this Spring Batch Example CSV to MySQL Using JPA. Some of them are listed below: Parsing Exception org.springframework.batch.item.file.FlatFileParseException: Parsing error at line: 8 in resource=[class path resource [invoices.csv]], input=[] Caused by: org.springframework.batch.item.file.transform.IncorrectTokenCountException: Incorrect number of tokens found in record: expected 5 actual 0 Cause of Exception This exception occurred while reading 8th line(8th record) in the CSV. Since we have only 7 records in the CSV, 8th record is blank. The ‘Caused By’ statement indicates that it was expected to be 5 columns, but there are 0 columns at line # 8. Solution In order to resolve this issue we need to create a record separator policy. In our example it is ‘BlankLineRecordSeparatorPolicy.java’ which extends ‘SimpleRecordSeparatorPolicy’ provided by Spring Batch API. Bad SQL Grammer Exception Caused by: org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [SELECT JOB_INSTANCE_ID, JOB_NAME from BATCH_JOB_INSTANCE where JOB_NAME = ? and JOB_KEY = ?]; nested exception is java.sql.SQLSyntaxErrorException: Table ‘batchdb.batch_job_instance’ doesn’t exist Cause of Exception By default Spring framework creates database tables of metadata for the Job Repository when the application starts. The spring boot batch is disabled to create the tables by default. The spring boot application should be enabled to create the tables to create and use the batch tables in the database. Solution In order to resolve this issue, we should add “spring.batch.initialize-schema” in the application.properties to create the batch tables in the database. The property should be set to “ALWAYS” to create the batch tables in the database. This property allows the spring boot batch to perform a database table check. Spring boot batch will use the tables if they exist. If the tables do not exist, it will create them for you. Therefore, add below entry into your application.properties file. spring.batch.initialize-schema=ALWAYS You may further refer official link to Spring Batch documentation. Related
Hello Sir, that’s an amazing tutorial. I kindly request you to do for flat file like a file with “.txt” extension. Thanks in advanced Reply
HI!, basically is the same logic, is by using FlatFileItemReader, in the documentation i think there is an example of how to read data from a .txt file Reply