Automation of An External Data Feed Extraction and Assembly Process

By Rajaganapathy K, Santhosh Kumar, Talha Khan, G.N. Shah, Ronald Mueller

This is the second in a series of posts summarizing case studies on robotic process automation (RPA). These posts describe a growing number of Macrosoft projects recently completed to solve client problems, as well as several that automate internal processes within our company. These are real-life case studies of RPA. The automation apps we have built and tested are now working as intended or are ready to move into production. In all cases, we are using Microsoft Power Automate to build and deploy the automation robots.

We are doing this post series to show the varied nature of RPA apps, from ultra-simple to highly complex over a broad range of business processes. We hope readers will find parallels in their own business processes. 

Stated simply, our view is that companies should seek out all processes that can be automated and build RPA apps to do that work. Gartner defines this view as ‘hyper-automation’, namely, “the idea that anything that can be automated in an organization should be automated.”

We hope this series of RPA case study posts will stimulate readers’ further interest in moving their companies in that direction. We are here to help.

Download the Case Study as PDF

Case Study 2: The Automation Challenge

This case study involves the automation of an external data extraction and assembly process for one of our clients. Macrosoft has been providing managed services to this client for many years, including developing and supporting several internal apps and their public website. We have also been providing integration services connecting their website to other enterprise apps, including Salesforce CRM and an external email marketing and delivery service.

We set about trying to find mundane tasks to automate within this company, and in conjunction with the client team, we came upon the current external data assembly process. The business requires a complex data extraction and assembly process involving gathering data from multiple external FTP sites, websites, and emails daily. The data extraction and assembly process was being done entirely manually until we undertook to automate it.

Here are some of the challenges the client faced with the current manual process. These had to be addressed in moving to a new automated process.

  1. Multiple Data Providers, Data Sources, and Formats:

Pulling source data was complicated and time-consuming. The external sources of the data had different formats, structures, and types. Moreover, once the data was fetched, it needed to be validated to make it compatible with the destination system before making it available for internal and end-users.

  2. Data Integrity, Transformation & Security:

Data quality was a primary concern in handling the data from multiple external sources. Poor data quality would create a compounding problem that could lead to compliance issues. If invalid or incorrect data were passed downstream in the data processing stages, it could lead to corrupt and incorrect results, a major problem for the company.

  3. Scalability:

The client faces significant heterogeneity in the data from these diverse sources and needs to assemble and standardize all of it into a unified data structure and system. The client expects to add many more data sources in the near future, which will lead to substantial growth in data volume over the current situation. To tackle this challenge, the client required us to employ a robust integration solution that can handle high volume and disparity in data without compromising performance.

Deeper Dive into the Case Study

As noted above, this second case study involves a client wanting to automate and synchronize data extracts from a whole range of external sources. We automated this process using Microsoft Power Automate, our go-to automation tool[1].

There were three parts to the automation, as outlined below. Each was built in succession using Microsoft Power Automate. The first two stages are complete, with the robots in production handling the data flows; the third is in progress.

Stage 1. FTP Sites:

The client connects to many vendor FTP accounts to fetch data made available by these vendors. The first critical issue in automating this stage was security: access to and connectivity with these data repositories had to be protected. The client requested a cost-effective, efficient, and scalable solution for automating the file transfer from multiple external FTP servers to the client’s internal enterprise Microsoft OneDrive platform.

Manual data transfer processes are vulnerable and subject to human error, making them inefficient and often unreliable. In automating these data transfers, our requirement was to reduce or eliminate the need for manual file exchanges and ensure high standards of data security. Automating these file transfers takes the manual file exchange burden off individual team members.

Macrosoft implemented this process using the Microsoft Power Automate SFTP connector to automate the file import process and validate the files received from each of the data providers. The automation solution was set up to alert designated parties with notification emails whenever the system encountered a delayed or missing file from any one of the data suppliers. The automation also significantly reduced manual verification efforts and, at the same time, removed the need for employees to access multiple FTP servers directly.
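The flow itself is built in Power Automate’s low-code designer, but the daily arrival check at the heart of Stage 1 can be sketched in plain Python. Everything here is a hypothetical illustration, not the client’s actual configuration: the provider names, filename patterns, and directory listing are invented.

```python
# Sketch of the Stage 1 logic: given the files found on the SFTP servers,
# determine which providers' expected daily files are missing so that
# alert emails can be sent. Names and patterns are hypothetical examples.

from datetime import date

# Hypothetical provider configuration: (provider name, expected filename pattern)
PROVIDERS = [
    ("vendor_a", "vendor_a_{d}.csv"),
    ("vendor_b", "vendor_b_{d}.csv"),
]

def expected_filename(pattern: str, d: date) -> str:
    """Resolve today's expected filename for a provider."""
    return pattern.format(d=d.isoformat())

def check_arrivals(received: set[str], d: date) -> list[str]:
    """Return the providers whose daily file has not arrived."""
    missing = []
    for name, pattern in PROVIDERS:
        if expected_filename(pattern, d) not in received:
            missing.append(name)
    return missing

# In the real flow, `received` would come from listing the remote directory
# via the SFTP connector; the missing list would drive notification emails.
received_today = {"vendor_a_2021-07-08.csv"}
print(check_arrivals(received_today, date(2021, 7, 8)))  # ['vendor_b']
```

In Power Automate the equivalent check is wired up with the SFTP connector’s file-listing action plus a condition and an email action; the sketch just makes the rule explicit.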

Stage 2. KPI Index Sites:

The client then asked Macrosoft to automate the collection and aggregation of multiple KPI files, assembling the individual KPI index files into one file format (pipe-delimited). Automating the opening of the individual KPI index files from the various providers was the key to this stage of the work. The aggregated pipe-delimited KPI index files needed to be saved to the client’s OneDrive platform, and a copy had to be uploaded to an external vendor.

The aggregated file included two additional columns for the date & source of the data generated dynamically when each file was received. Other fields in the data had to be updated at specific times of the day to reflect conditions at those times, instead of the values in place when the files were extracted from the vendor site.

Macrosoft had demonstrated the automation capability of Microsoft Power Automate in the prior stage of the project by creating automated workflows that fetch the data files from various FTP sites and save the data to the client’s OneDrive platform.

We implemented this second-stage process using Microsoft Power Automate as well. The steps in this automated process included: fetching the KPI index files from the external sites; opening these files in XLS, CSV, and SDC formats; utilizing the workflow capabilities in Power Automate to parse the file contents; aggregating the data to create the merged single output file in the required format (pipe-delimited); and finally transmitting the merged file to the destination FTP site while keeping a copy in the client’s OneDrive system. The workflow was set up with the necessary alert notifications at various trigger points.
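The aggregation step above can be sketched as follows. This is an illustrative Python analogue of the Power Automate workflow, not the production flow itself; the file names and column layout are assumptions, and for simplicity only CSV input is shown (the real flow also handles XLS and SDC files).

```python
# Sketch of the Stage 2 aggregation: merge per-provider KPI files into one
# pipe-delimited output, adding the date and source columns to each row.
# File names, column names, and layout are hypothetical examples.

import csv
import io
from datetime import date

def aggregate(kpi_files: dict[str, str], run_date: date) -> str:
    """Merge CSV-formatted KPI files (source name -> file contents)
    into one pipe-delimited string with added date/source columns."""
    out = io.StringIO()
    writer = None
    for source, contents in kpi_files.items():
        reader = csv.DictReader(io.StringIO(contents))
        for row in reader:
            # The two dynamically generated columns described above.
            row["date"] = run_date.isoformat()
            row["source"] = source
            if writer is None:
                writer = csv.DictWriter(out, fieldnames=list(row), delimiter="|")
                writer.writeheader()
            writer.writerow(row)
    return out.getvalue()

files = {
    "provider_x": "kpi,value\nuptime,99.9\n",
    "provider_y": "kpi,value\nlatency,120\n",
}
print(aggregate(files, date(2021, 7, 8)))
```

In the workflow, the merged string would then be written to OneDrive and pushed to the destination FTP site.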

Stage 3. Email Attachments (in progress):

The client’s Compliance team receives multiple KPI index files as email attachments from suppliers. The Compliance team then creates a Master File for their internal usage. Until the automation process is completed, the team manually fetches the data from these emails and applies specific calculations to the data.

Macrosoft will utilize Microsoft Power Automate to fetch the KPI index files from the email attachments, use the workflows available in Power Automate to filter the data and apply all necessary calculations, create the Master File in the required format, and finally transmit the Master File to the Compliance team via email while also storing a copy on the client’s OneDrive system.
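Since this stage is still being automated, here is only a minimal sketch of the planned logic in Python: filter the relevant attachments and apply a calculation to build the Master File. The attachment format and the "calculation" (a simple total per KPI) are hypothetical stand-ins for the Compliance team’s actual rules.

```python
# Sketch of the Stage 3 logic: collect KPI rows from email attachments,
# filter out irrelevant attachments, and aggregate into a Master File.
# Attachment names, row format, and the per-KPI total are hypothetical.

def build_master(attachments: dict[str, list[tuple[str, float]]]) -> dict[str, float]:
    """Aggregate (kpi, value) rows from each KPI attachment into totals per KPI."""
    totals: dict[str, float] = {}
    for filename, rows in attachments.items():
        if not filename.endswith(".csv"):      # filter non-KPI attachments
            continue
        for kpi, value in rows:
            totals[kpi] = totals.get(kpi, 0.0) + value
    return totals

attachments = {
    "supplier1_kpi.csv": [("uptime", 99.5), ("latency", 100.0)],
    "supplier2_kpi.csv": [("uptime", 98.5)],
    "logo.png": [],
}
print(build_master(attachments))  # {'uptime': 198.0, 'latency': 100.0}
```

In Power Automate, the incoming-email trigger supplies the attachments, and the resulting Master File is emailed to the Compliance team and saved to OneDrive.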

Benefits of this Automation to the Client

There are three major benefits to the client from this automation process:

  1. Increased Employee Satisfaction and Decreased Errors:

The client’s team was spending 3-5 hours a day manually downloading the files from each FTP server, then using manual processes and macros to create the output spreadsheets and files, the aggregated Master File, and various other reports. All of this was done without significant, sustained daily validation of the input files received or of the final data files and reports created after all the data assembly and calculations.

Implementation of Power Automate has cut down this manual time to 5 minutes and, equally important, has decreased the risk of errors at any point in the processes. Employees now feel a sense of freedom and accomplishment as the mundane task is taken care of by automation. The employees can spend their valuable time on other higher-value tasks. 

  2. Reduced Application Development Cost:

The client initially considered developing internal tools and applications (using .NET, Python, PowerShell) to do this automation. Macrosoft’s solution, using the low-code Power Automate platform, drastically reduced the effort, cost, and time of developing the solution while achieving all client goals.

As an average benchmark, the cost of creating an app with Power Automate can be as much as 70% less than creating the app via the standard development process. About the same factor applies as well to the time it takes to build the Power Automate app.

  3. Data Integrity & Security:

Automated alert notifications were set up at every step in the automation process to alert designated team members whenever a problem was encountered: for example, when files were missing, file contents were empty, or the process failed to complete a task.

A support ticket is automatically created, so team members can follow up with the specific data provider where the problem is diagnosed. This helped the client significantly in maintaining complete and accurate compliance documentation.

These alerts are also helpful when FTP account credentials change, FTP transmission methods change, or file structures or data elements are modified. Any of these changes will stall the workflows or cause process errors and need to be flagged. With the automation process now in place, these alerts are immediate, and they identify the specific external site or process step where the error is encountered.

Clearly, in many of these error cases, a team member will need to manually diagnose the problem and identify the changes needed to the workflows. Since the system is built using Power Automate, almost all such changes can be handled via the user interface without the need for a developer to step in.


General Observations

This section highlights some observations that are specific to this case study but that we also feel are generally relevant and important.

  1. Manual data transfer processes are vulnerable and subject to human error, making them inefficient and often unreliable. Automating data transfers can reduce or eliminate the need for manual file exchanges and, in so doing, retain and enhance data security. Also, automating file transfers takes the manual (and usually mundane) file exchange burden off individual team members.
  2. It is straightforward to automate this type of process using the Power Automate solution (SFTP connector). In this case, we automated data transfers from multiple FTP sources, eliminating the need for manual file exchanges. The solution was set up to alert team members with notification emails on any delayed or missing files. This reduced manual verification efforts, and at the same time, it lessened the need for employees to access multiple FTP servers directly, never a good practice.
  3. Macrosoft implemented Power Automate workflow capabilities for fetching files, opening them (XLS, CSV, SDC format), parsing contents, aggregating the data to create a merged single output, and finally transmitting an output file to designated destinations. Macrosoft also automated the process of fetching files from email attachments. All of this was done using the user interface in Power Automate.

Power Automate

Power Automate is Macrosoft’s go-to RPA implementation technology. Essentially all robots we have built to date have been built using Power Automate. It is well rated by technology rating companies such as Gartner, and it is clear to us that this platform will continue to advance quickly to the forefront of the field. Moreover, it is easy to use and generally much cheaper for mid-sized companies than its competitors, especially for Microsoft-based companies.

Besides the case study series of papers, we have a second series of papers on our website describing the major functions and capabilities of Microsoft Power Automate and how each comes into play in our RPA professional services practice.

We hope that by writing these posts we can better communicate to our clients and potential clients how well equipped and ‘ready for action’ Power Automate is, allowing us and them to quickly, efficiently, and at low cost build a whole ensemble of RPA apps that fulfill the vision of hyper-automation.

Macrosoft’s RPA Experience

Macrosoft is rapidly expanding its technical expertise in implementing RPA technologies based on the Power Automate toolset. We consider this to be a significant and growing technology area for us, one in which we are investing significant resources and efforts in training and certifying our developers and data engineers.

It is also one in which we are looking to do as many projects as possible, either internally or in conjunction with our partners and clients. We want to expand our experience to as many use cases as we can, as quickly as we can, so if you have some specific business process in mind, contact us, we will be glad to discuss it with you, and possibly we can work together to automate it!

We see this tech area consisting of two major components of work for us:

  1. Implementing automation process modules, in which we work with a business owner to map out the rules and conditions the automation tool must follow, either in an attended or unattended mode. We are already doing this work in many of our case studies.
  2. Implementing AI/ML models to intelligently understand and act upon document content and/or other content essential to a particular automation process. This component is used in some of the case studies we have done, and you can read about them in our other posts. This is a growing concentration of our work, and we invite you to contact us to hear more about how this works in real-life RPA apps.

Call us if you want to discuss how RPA might fit into your business or you have a specific business problem or challenge where you think RPA might make a difference. We will be happy to discuss this with you.


[1] A later section of this post describes Macrosoft’s capabilities with RPA in general and with Microsoft Power Automate in particular.


By Rajaganapathy K, Santhosh Kumar, Talha Khan, G.N. Shah, Ronald Mueller | July 8th, 2021 | Process Automation

About the Author

Rajaganapathy K, International VP of Technology

Rajaganapathy brings more than 22 years of hands-on experience in business development and information systems technology to Macrosoft Inc. As International VP of Technology, Raja leads all the international development, business, CCM, and web development practices. Additionally, Raja oversees pre-sales activities and marketing.

Raja is passionate about business, with a focus on collaborative team-building and a commitment to customer satisfaction. Further, Raja is an excellent motivator who leverages his business and technical acumen to deliver projects effectively. Above all, Raja maintains a record of success across 15 years of wearing many hats at Macrosoft.

Raja holds a Bachelor of Engineering in Computer Science from Madras University.

Santhosh Kumar


Santhosh Kumar works as a Technical Lead on Macrosoft’s web development team. He has over 13 years of experience architecting and developing various web-based applications, with expertise in the Microsoft Power Platform. He has implemented several automation solutions for complex day-to-day tasks for Macrosoft clients.

Santhosh Kumar holds a Master's degree in Computer Applications.

Talha Khan


Talha is an Advanced Analytics and Data Science professional at Macrosoft. He has worked on multiple Big Data and analytics projects and holds multiple certifications in analytics and data science-related technologies, including CallMiner. Talha is passionate about leveraging advanced analytics and data science tools to help clients further their interests. He has a Bachelor of Science in Accounting and Finance with a minor in Economics from the University of New Hampshire.

G.N. Shah, Chief Technology Officer of Macrosoft


Shah is a forward-thinking institutional leader with eighteen years of experience. Throughout his tenure, Shah has delivered top-notch customer solutions in large-scale enterprise environments. His proven abilities as a technology visionary and driver of strategic business systems development allow Macrosoft to deliver best-in-class software solutions. Shah currently holds a compiler patent with the US Patent Office.

Shah holds an MBA (Computer Science), in addition to 20+ professional and technical certifications. While he is proficient in a variety of development languages, his preferred language is Python. His areas of expertise include enterprise-wide architecture, application migration, IT transformation, mobile, and offshore development management. Shah’s ultimate goal, at Macrosoft, is to create a larger offering of product-based services while adopting new technologies.

In that rare instance when he has time for leisure, Shah is an avid cricket and football fan– as well as a weekly racquetball player.

Dr. Ronald Mueller, CEO of Macrosoft


Ron is CEO and Founder of Macrosoft, Inc. He heads up all company strategic activities and directs the day-to-day work of the Leadership Team at Macrosoft. As Macrosoft’s Chief Scientist, Ron defines and structures Macrosoft’s path forward, focusing on new technologies and products such as Cloud, Big Data, and AI/ML/WFP. Ron has a Ph.D. in Theoretical Physics from New York University and worked in physics for over a decade at Yale University, The Fusion Energy Institute in Princeton, New Jersey, and Argonne National Laboratory.

Ron also worked at Bell Laboratories in Murray Hill, New Jersey, where he managed a group on Big Data; his work there focused on early neural networks. Ron has a career-long passion for ultra-large-scale data processing and analysis, including predictive analytics, data mining, machine learning, and deep learning.
