Locus’ environmental data management and EHS compliance software solutions are built on the cloud. Read more about how we use the cloud to help our customers.

Artificial Intelligence and Environmental Compliance–Revisited–Part 2: IoT

More recently, big data has become more closely tied to IoT-generated streaming datasets such as those from Continuous Emission Monitoring Systems (CEMS), real-time remote control and monitoring of treatment systems, water quality monitoring instrumentation, wireless sensors, and other types of wearable mobile devices. Add digitized historical records to this data streaming, and you end up with a deluge of data. (To learn more about big data and IoT trends in the EHS industry, please read this article: Keeping the Pulse on the Planet using Big Data.)

In the 1989 Hazardous Data Explosion article that I mentioned earlier, we first identified the limitations of relational database technology in interpreting data and the important role that IoT (automation, as it was called at the time) and AI were going to play in the EHS industry. We wrote:

“It seems unavoidable that new or improved automated data processing techniques will be needed as the hazardous waste industry evolves. Automation (read IoT) can provide tools that help shorten the time it takes to obtain specific test results, extract the most significant findings, produce reports and display information graphically,” 

IoT - Internet of Things

We also claimed that “expert systems” (software programmed using artificial intelligence (AI) techniques, which draws on databases of expert knowledge to offer advice or make decisions) and AI could be possible solutions. These technologies have been a long time coming but still have a promising future in the context of big data.

“Currently used in other technical fields, expert systems employ methods of artificial intelligence for interpreting and processing large bodies of information.” 

Although “expert systems” as a backbone for AI did not materialize as originally envisioned by researchers, they were a necessary step toward using big data to fulfill the purpose of an “expert”.

AI can be harnessed in a wide range of EHS compliance activities and situations to contribute to managing environmental impacts and climate change. Some examples of application include AI-infused permit management, AI-based permit interpretation and response to regulatory agencies, precision sampling, predicting natural attenuation of chemicals in water, managing sustainable supply chains, automating environmental monitoring and enforcement, and enhanced sampling and analysis based on real-time weather forecasts. 

Parts one, three, and four of this blog series complete the overview of Big Data, IoT, AI and multi-tenancy. We look forward to feedback on our ideas and are interested in hearing where others see the future of AI in EHS software – contact us for more discussion or ideas!


    Artificial Intelligence and Environmental Compliance–Revisited

    On 12 April 2019, Locus’ Founder and CEO, Neno Duplan, received the prestigious Carnegie Mellon 2019 CEE (Civil and Environmental Engineering) Distinguished Alumni Award for outstanding accomplishments at Locus Technologies. In light of this recognition, Locus decided to dig into our blog vault and share a series of visionary blogs crafted by our Founder in 2016. These ideas are as timely and relevant today as they were three years ago, and hearken to his formative years at Carnegie Mellon, which formed the foundation for the current success of Locus Technologies as a top innovator in the water and EHS compliance space.

    Artificial Intelligence (AI) for Better EHS Compliance (original blog from 2016)

    It is funny how a single acronym can take you back in time. A few weeks ago when I watched 60 Minutes’ segment on AI (Artificial Intelligence) research conducted at Carnegie Mellon University, I was taken back to the time when I was a graduate student at CMU and a member of the AI research team for geotechnical engineering. Readers who missed this program on October 9, 2016, can access it online.

    Fast forward thirty plus years and AI is finally ready for prime time television and a prominent place among the disruptive technologies that have so shaken our businesses and society. This 60 Minutes story prompted me to review the progress that has occurred in the field of AI technology, why it took so long to come to fruition, and the likely impact it will have in my field of environmental and sustainability management. I discuss these topics below. I also describe the steps that we at Locus have taken to put our customers in the position to capitalize on this exciting (but not that new) technology.

    What I could not have predicted when I was at Carnegie Mellon is that AI was going to take a long time to mature–almost the full span of one’s professional career. The reasons for this are multiple, the main one being that several other technologies were absent or needed to mature before the promises of AI could be realized. These are now in place. Before I dive into AI and its potential impact on the EHS space, let me touch on these “other” major (disruptive) technologies without which AI would not be possible today: SaaS, Big Data, and IoT (Internet of Things).

    Locus Artificial Intelligence

    As standalone technologies, each of these has brought about profound changes in both the corporate and consumer worlds. However, these impacts are small when compared to the impact all three of these will have when combined and interwoven with AI in the years to come. We are only in the earliest stages of the AI computing revolution that has been so long in the coming.

    I have written extensively about SaaS, Big Data, and IoT over the last several decades. All these technologies have been an integral part of Locus’ SaaS offering for many years now, and they have proven their usefulness by rewarding Locus with contracts from major Fortune 500 companies and the US government. Let me quickly review these before I dive into AI (as AI without them is not a commercially viable technology).


    Big Data

    Massive quantities of new information from monitoring devices, sensors, treatment systems controls and monitoring, and customer legacy databases are now pouring into companies’ EHS departments with few tools to analyze them on arrival. Some of the data is old information that is newly digitized, such as analytical chemistry records, but other information, like streaming data from wireless and wired monitoring sensors, is entirely new. At this point, most of these data streams are highly balkanized as most companies lack a single system of record to accommodate them. However, that is all about to change.

    As a graduate student at Carnegie Mellon in the early eighties, I was involved with the exciting R&D project of architecting and building the first AI-based Expert System for subsurface site characterization, not an easy task even by today’s standards and technology. AI technology at the time was in its infancy, but we were able to build a prototype system for geotechnical site characterization, to provide advice on data interpretation and on inferring depositional geometry and engineering properties of subsurface geology with a limited amount of data points. The other components of the research included a relational database to store the site data, graphics to produce “alternative stratigraphic images” and network workstations to carry out the numerical and algorithmic processing. All of this transpired before the onset of the internet revolution and before any acronyms like SaaS, AI, or IoT had entered our vocabulary. This early research led to the development of a set of commercial tools and technological improvements and ultimately to the formation of Locus Technologies in 1997.

    Part of this early research included management of big data, which is necessary for any AI undertaking. As a continuation of this work at Carnegie Mellon, Dr. Greg Buckle and I published an article in 1989 about the challenges of managing massive amounts of data generated from testing and long-term monitoring of environmental projects. This was at a time when spreadsheets and paper documents were king, and relational databases were little used for storing environmental data.

    The article, “Hazardous Data Explosion,” published in the December 1989 issue of the ASCE Civil Engineering Magazine, was among the first of its kind to discuss the upcoming Big Data boom within the environmental space and placed us securely at the forefront of the big data craze. This article was followed by a sequel article in the same magazine in 1992, titled “Taming Environmental Data,” that described the first prototype solution to managing environmental data using relational database technology. In the intervening years, this prototype eventually became the basis of the industry’s first multi-tenant SaaS system for environmental information management.

    Locus - Big Data - IoT - AI

    Today, the term big data has become a staple across various industries to describe the enormity and complexity of datasets that need to be captured, stored, analyzed, visualized, and reported. Although the concept may have gained public popularity relatively recently, big data has been a formidable fixture in the EHS industry for decades. Initially, big data in the EHS space was almost entirely associated with the results of analytical, geotechnical, and field testing of water, groundwater, soil, and air samples in the field and laboratory. Locus’ launch of its Internet-based Environmental Information Management (EIM) system in 1999 was intended to provide companies not only with a repository to store such data, but also with the means to upload such data into the cloud and the tools to analyze, organize, and report on these data.

    In the future, companies that wish to remain competitive will have no choice but to bring together their streams of (seemingly) unrelated and often siloed big data into systems such as EIM that allow them to evaluate and assess their environmental data with advanced analytics capabilities. Big data coupled with intelligent databases can offer real-time feedback for EHS compliance managers, who can then better track and offset company risks. Without the big data revolution, there would be no coming AI revolution.


    AI and Water Management – Looking Ahead

    There has been much talk about how artificial intelligence (AI) will affect various aspects of our lives, but little has been said to date about how the technology can help to make water quality management better. The recent growth in AI spells a big opportunity for water quality management. There is enormous potential for AI to be an essential tool for water management and decoupling water and climate change issues.

    Two disruptive megatrends, digital transformation and decarbonization of the economy, could come together in the future. AI could make a significant dent in global greenhouse gas (GHG) emissions by merely providing better tools to manage water. A substantial amount of energy is consumed by water treatment and movement. AI can help optimize both.

    AI is a collective term for technologies that can sense their environment, think, learn, and take action in response to what they’re detecting and their objectives. Applications can range from automation of routine tasks like sampling and analyses of water samples, to augmenting human decision-making, and beyond that to automation of water treatment systems and to discovery: sifting vast amounts of data to spot and act on patterns that are beyond our current capabilities.

    Applying AI in water resource prediction, management and monitoring can help to ameliorate the global water crisis by reducing or eliminating waste, as well as lowering costs and lessening environmental impacts.

    Parts two, three, and four of this blog series complete the overview of Big Data, IoT, AI and multi-tenancy. We look forward to feedback on our ideas and are interested in hearing where others see the future of AI in EHS software – contact us for more discussion or ideas!


      EHS Digital Transformation: Managing Drinking Water Quality Data and Compliance: CCR in the Cloud

      In most industrialized cities around the world, drinking water is readily available and safe. Safeguarding groundwater (aquifers), streams, rivers, reservoirs, and lakes is crucial to continue delivering clean water on tap. So are testing and validated water quality data. There are several aspects of drinking water quality that are of concern in the United States, including Cryptosporidium, disinfection by-products, lead, perchlorates, and pharmaceutical substances.

      Mobile - Managing Drinking Water Quality Data and Compliance

      Recent headlines about water quality issues in cities like Flint, Pittsburgh, Asheville, or Rome and Cape Town are motivating consumers to ask more questions about their water quality. Albuquerque’s groundwater is becoming seriously depleted; Fresno’s groundwater is highly susceptible to contamination; in Atlanta, Chicago, Detroit, Houston, Los Angeles, New Orleans, Newark, Philadelphia, Phoenix, San Diego and Washington, D.C., source water is threatened by runoff and industrial or sewage contamination; and water supplies in Baltimore, Fresno, Los Angeles, New Orleans, San Diego, and several other cities are vulnerable to agricultural pollution containing nitrogen, pesticides or sediment.

      Drinking water supply

      Locus Technologies IoT Monitoring. Connected at all times.

      In most cities in the US, drinking water quality is in conformity with the norms of the Safe Drinking Water Act, which requires EPA to set Maximum Contaminant Levels (MCL) for potential pollutants. In addition, the EPA’s Consumer Confidence Report (CCR) Rule of 1998 requires most public water suppliers to provide consumer confidence reports, also known as annual water quality reports, to their customers. Each year by July 1 anyone connected to a public water system should receive in the mail an annual water quality report that tells where water in a specific locality comes from and what’s in it. Locus EIM automates this reporting and allows utilities to be transparent by publishing CCR online in real time so that consumers have access to their CCR at all times. Consumers can also find out about these local reports on a map provided by EPA.

      Utilities must maintain good water quality records and manage them in a secure database with built-in alerts for any outliers so that responsible water quality managers can react quickly when there is exceedance of MCL or another regulatory limit.
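As a rough illustration of the kind of built-in alert described above, the following minimal Python sketch compares incoming results against regulatory limits and returns any exceedances. It is not Locus EIM code: the `Result` record, the `find_exceedances` function, the sample locations, and the limit table are all hypothetical, and the limit values are illustrative placeholders that should be checked against current EPA regulations before any real use.

```python
from dataclasses import dataclass

# Illustrative limits in mg/L (assumed values for this sketch only;
# verify against current EPA regulations before relying on them).
LIMITS_MG_L = {"arsenic": 0.010, "nitrate": 10.0}


@dataclass(frozen=True)
class Result:
    """One water quality measurement (hypothetical record layout)."""
    location: str
    analyte: str
    value_mg_l: float


def find_exceedances(results, limits=LIMITS_MG_L):
    """Return the results whose value is above the limit for their analyte.

    Analytes without a configured limit are ignored rather than flagged.
    """
    return [
        r for r in results
        if r.analyte in limits and r.value_mg_l > limits[r.analyte]
    ]


samples = [
    Result("Well-1", "arsenic", 0.004),
    Result("Well-2", "arsenic", 0.013),
    Result("Tap-7", "nitrate", 11.2),
    Result("Tap-8", "nitrate", 3.1),
]

alerts = find_exceedances(samples)
for r in alerts:
    print(f"ALERT: {r.analyte} at {r.location} = {r.value_mg_l} mg/L")
```

In a production system the same comparison would typically run as data arrives, with the alert routed to the responsible water quality manager rather than printed.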


      Locus Technologies receives the prestigious EBJ Award for innovation and growth for 13 consecutive years

      Environmental Business Journal (EBJ) recognized the firm for growth and innovation in the field of Information Technology, customer diversification and IoT integration

      MOUNTAIN VIEW, Calif., 5 March 2019 — Locus Technologies, a leading provider of multi-tenant, SaaS-based EHS software, was awarded its 13th consecutive award by Environmental Business Journal (EBJ) for growth and innovation in the field of Information Technology. EBJ is a business research publication providing high-value strategic business intelligence to the environmental industry. Locus received the award for its 2018 customer diversification along with pioneering IoT integration for its various platforms and taking a leadership role in water quality management software for water utilities. Locus’ flagship products Locus EIM, Locus Platform and Locus Mobile continue to resonate with the marketplace and attract a wide range of forward thinking customers who rely on Locus software to provide innovative, secure and scalable applications to solve environmental compliance challenges.

      “Locus continues to influence the industry with its forward-thinking product set, pure SaaS architecture, and eye for customer needs,” said Grant Ferrier, president of Environmental Business International Inc. (EBI), publisher of Environmental Business Journal.

      “We are honored to receive the EBJ Information Technology award once again, and we shall continue to design robust solutions to meet diverse and complex EHS challenges with innovative cloud and mobile-based applications,” said Wes Hawthorne, President of Locus Technologies.

      EHS software switch to multi-tenancy: Too late to switch for many vendors

      The announcements by several EHS software vendors this fall caught my attention. After offering their software on-premises for over a decade, suddenly many are discovering and planning to introduce multi-tenant Software-as-a-Service (SaaS) while promising to continue to maintain their current on-premises or single-tenant offerings. In essence, they are introducing multi-tenancy as if it were a new version of their software. This plan is not going to work! Let me explain why.

      Most public announcements begin something like this: In the next several years we plan to expand our software offerings and offer our customers the option to move from their current on-premises solution to the cloud. However, is this even possible? What they consider the “cloud” may not be a true multi-tenant cloud. That train departed years ago, and most of the current EHS software vendors missed it. While multi-tenancy has been a game changer in the tech industry, many are uncertain of exactly what makes an application “multi-tenant” or why it matters.

      Locus Multi-Tenant Software

      There is a considerable degree of (intended) confusion in the EHS software space when it comes to the definition of a real cloud or better said, multi-tenancy. Companies that are considering SaaS solutions for EHS software hear all sorts of things from EHS software vendors hoping to tap into the momentum of cloud computing. Many go as far as saying “sure; we can do multi-tenant, single-tenant, whatever tenant you need!” –anything to win the job. These vendors do not understand the real cloud.

      Multi-tenancy is a significant shift in computing and requires an all-new approach to the software architecture and the delivery model from the ground up. It is transformational, and customers who intend to buy the next generation of EHS software should spend the time to understand the differences. More importantly, multi-tenancy is a principle, not a software version or an upgrade. It is not an evolutionary step; instead, it is a revolution in the software delivery model and it matters in the long run for the customer.

      Multi-tenant architecture

      Figure 1: The single-tenant model cannot easily be switched or “upgraded” to multi-tenant. The software architecture does not allow an easy switch, in the same way that a single-family home cannot be “remodeled” to become a multi-tenant high-rise. What differentiates multi-tenant application architecture is its effectiveness in achieving the same goal in a scalable and sustainable fashion.

      Can anyone imagine companies like eBay, Salesforce, Google, Workday, or Amazon offering a “single-tenant” solution side by side with their multi-tenant clouds? I argue that any EHS software vendor who offers a single-tenant solution of any type cannot be a serious contender in multi-tenant SaaS.

      EHS software vendors with on-premises software applications or single-tenant web-enabled offerings are seduced by the seemingly low barriers to entry into the SaaS market with an architecture that leverages virtualization. This approach allows a software company to quickly offer subscription-based services of their legacy product to their initial customers. In the long run, however, this multi-instance approach just won’t scale economically. The recent wave of ownership changes among EHS software companies is the best indicator that the companies sold became victims of their initial success. A SaaS provider who leverages virtualization puts the long-term viability of the business at risk as more efficient SaaS competitors come to dominate the market.

      Multi-tenant architecture

      Figure 2: Single-tenant requires many more vendor resources. The resource costs are eventually passed to customers. Each upgrade of the application requires each customer to upgrade independently, and the ability to implement tenant management tools and tenant-specific customizations is significantly limited. The benefit of multi-tenancy is that instead of 100 copies of the OS, 100 copies of the database, and 100 copies of apps, it has 1 OS, 1 DB and 1 app on the server, with significantly fewer vendor resources required to manage it. Those savings are passed on to customers over the long term.
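The arithmetic behind the caption can be made concrete with a small Python sketch. It assumes, as the caption does, that a single-tenant vendor runs one full stack (OS, database, application) per customer while a multi-tenant vendor runs one shared stack; the function name `stack_counts` is hypothetical, and real deployments will add load balancers, replicas, and other components that this simplification ignores.

```python
def stack_counts(customers: int, multi_tenant: bool) -> dict:
    """Count the software components a vendor must operate and upgrade.

    Simplifying assumption: single-tenant means one full stack per
    customer; multi-tenant means one shared stack for all customers.
    """
    copies = 1 if multi_tenant else customers
    return {"os": copies, "database": copies, "app": copies}


single = stack_counts(100, multi_tenant=False)
multi = stack_counts(100, multi_tenant=True)

# Total components to patch, monitor, and upgrade in each model.
print(sum(single.values()), sum(multi.values()))  # 300 3
```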

      Multi-tenant architecture

      Figure 3: The multi-tenant model requires fewer resources and allows easier (and rolling) upgrades (i.e., no version number necessary). There is only one software instance and hardware stack for multiple tenants. All customers are always on the latest version of the software. Locus Technologies figured this out in 1999 and attributes its phenomenal success since then to multi-tenancy. It can scale up without adding proportional cost. Others cannot.

      Multi-tenant architecture

      Figure 4: “Can’t we create a separate stack for just this one customer? I promise it’s just this one…” Even a single installation for one “special” customer breaks the multi-tenant model. Don’t do it.

      I would also add that single-tenant (hybrid) cloud applications are worse than on-premises installations. Why? Because they are fake clouds. In single tenancy, each customer has its own independent database and instance of the software. These instances may reside on the same or different servers. In this model, a customer is, in fact, outsourcing maintenance of their application (software and hardware) to a vendor (or their consultant) that is not likely equipped to perform these tasks. No single vendor in the EHS software industry is large enough to undertake maintenance of the single-tenant infrastructure on behalf of their customers regardless of how inexpensive hardware or software virtualization may be. Even if they offer their hosting on Microsoft Azure Cloud or Amazon Web Services (AWS), they still cannot guarantee multi-tenancy as these solutions address only hardware challenges.

      The Economist magazine in 2004 described it: “Those forerunners also promised a software revolution by hosting the software applications of companies. But they failed because they simply recreated each client’s complex and unwieldy datacentre in their own basements, and never overcame the old problems of installation and integration with other software. With each new customer, the old ASPs had, in effect, to build another datacenter; there were few economies of scale.”

      To improve their position in a shifting marketplace, on-premises EHS vendors have found a way to market their solutions as “cloud-based” when they are not backed by the fundamental principle of what that means. Considering the large investment that is associated with the purchase or licensing of EHS software, it is critical for customers to be able to tell a true cloud product from a fake one. But how can you spot a fake?

      Just ask the EHS software vendor these four questions:

      1. Do you support both single-tenant and multi-tenant deployments of your software?
      2. Does your software have version numbers? 
      3. Do you charge for upgrades?
      4. Can we install your software on our infrastructure?

      If the answer to any of these questions is yes, the vendor is not committed to only multi-tenant architecture, and you should not move to their “cloud.”

      Multi-tenancy is the only proven SaaS delivery architecture that eliminates many of the problems created by the traditional software licensing and upgrade model where software is installed as a single-tenant application on a customer’s premises or at a customer’s or vendor’s data center. In contrast, in multi-tenancy, all customers access the same software on one or a set of linked servers.
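To make the contrast concrete, here is a minimal Python sketch of one common way to build a multi-tenant data layer: a single shared database in which every row carries a tenant identifier and every query is scoped to one tenant. This is a generic pattern shown for illustration, not a description of how Locus or any other vendor implements its platform; the table layout, the tenant names, and the `samples_for` helper are all hypothetical.

```python
import sqlite3

# One database and one schema shared by all tenants (in-memory for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples ("
    "tenant_id TEXT NOT NULL, location TEXT, analyte TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?)",
    [
        ("acme", "MW-1", "benzene", 0.002),
        ("acme", "MW-2", "benzene", 0.007),
        ("globex", "SB-4", "lead", 0.011),
    ],
)


def samples_for(tenant_id):
    """Return only the rows that belong to the given tenant."""
    rows = conn.execute(
        "SELECT location, analyte, value FROM samples "
        "WHERE tenant_id = ? ORDER BY location",
        (tenant_id,),
    )
    return rows.fetchall()


print(samples_for("acme"))
print(samples_for("globex"))
```

Because there is one schema and one code base, a change deployed once reaches every tenant at the same time, which is the property the single-tenant model cannot offer.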

      Multi-tenancy requires a new architectural approach. Companies have to develop applications from the ground up for multi-tenancy. Once companies commit their limited financial resources to one architecture, it becomes nearly impossible for them to switch to the multi-tenancy model, no matter how many resources they have available. For this reason, I am skeptical that many current vendors will be able to make a switch to multi-tenancy fast enough.

      A vendor who is invested in on-premise, hosted, and hybrid models cannot commit to providing all the benefits of a true SaaS model due to conflicting revenue models. Their resources are going to be spread thin supporting multiple versions rather than driving innovation. Additionally, if the vendor makes the majority of their revenue selling on-premise software, it will be very difficult for them to fully commit to a true SaaS solution since the majority of their resources will be allocated to supporting the on-premise software.

      And if they suddenly introduce a “multi-tenant” model (after selling an on-premises version for 10+ years), who in the world would want to migrate to that experimental cloud without putting the contract out to bid to explore a switch to well-established and market-tested true multi-tenant providers? Even Google and Microsoft are playing a catch-up game with Amazon’s AWS when it comes to the cloud hosting business. The first-mover advantage in multi-tenancy is huge for any vendor.

      In summary, an EHS software vendor can be either truly multi-tenant or not. If a vendor has installed their software on somebody else’s hardware and runs multiple instances of that software (even if the code base is the same), they are not and will never be truly multi-tenant.

      Multi-tenant architecture

      Figure 5: Where do you want your software to reside? In multi-tenant or single-tenant infrastructure? If multi-tenancy is attempted on old infrastructure or through a legacy application upgrade, watch out. After a vendor has built the first few floors of that skyscraper, there is no easy way to replace the foundation. You will be lucky if it ends up like the Tower of Pisa or the Millennium Tower in San Francisco. To keep the tower alive, they will have to do constant underpinning of the foundation and restrict access to the structure. And you, the customer, will pay for it. That is what many customers of single-tenant EHS vendors are facing today.

      Therefore, when considering a SaaS solution, make sure that the vendor is a true SaaS vendor who is solely committed to the multi-tenant SaaS delivery model and has invested in a true multi-tenant platform. This is the only way to reap all the benefits that a SaaS model has to offer.

      PennJersey Environmental Consulting selects Locus EIM SaaS-based software for its environmental compliance data

      Locus’ EIM Solution Will Streamline PennJersey Consulting’s Entire Environmental Laboratory Data Validation and Reporting

      MOUNTAIN VIEW, Calif., 11 December 2018 — Locus Technologies, (Locus), the industry leader in EHS, sustainability, and compliance management software, is pleased to announce that PennJersey Environmental Consulting (PennJersey), a leading environmental site assessment and remediation firm located in Milford, New Jersey, has selected Locus EIM SaaS-based software to more efficiently track analytical data, automate its field data collection, laboratory analyses, and overall enterprise data consolidation for its clients.

      “With Locus EIM, our professionals and staff will be positioned to manage our laboratory data more efficiently.  We were especially drawn to the ability to automate the laboratory data validation to assure the quality and usability of the data. EIM will provide our clients greater efficiency and allow us to focus on providing the timely and cost-effective solutions to their assessment and remediation challenges,” said Rodger Ferguson, President of PennJersey.

      “Our deep understanding of the EHS compliance market enables us to quickly address environmental mandates, such as PennJersey’s tracking and management of soil, air, and groundwater data, with precision,” said Neno Duplan, Founder and CEO of Locus.  “Locus EIM can indicate what levels of target compounds are in the soil, air, or water samples, how the data are trending, and provide real-time alerts to abnormalities.  Overall, our solution ensures better monitoring, real-time analysis, aggregations and reporting of data that leverages a modern SaaS-based platform.”

      Blockchain: aggregate emissions reporting

      In the next few years, an opportunity exists to make significant advances in how we monitor and manage environmental emissions to the air, soil, and water, potentially resulting in significant disruptions in current approaches. Currently, industries and commercial establishments monitor their emissions and submit reports on a regular basis, often as frequently as quarterly, to federal and state agencies to demonstrate they are meeting regulatory requirements. However, no one on the generating or receiving end of these data dumps and reports is aggregating these emissions to create a more composite, inclusive picture of emissions across sources or media. The reason is that emissions of different types and to different media are reported to separate regulatory entities that, in general, do not interact or talk to one another. And although there are significant potential benefits to both generators and regulators in reviewing integrated environmental data sets, our traditional methods of storing and sharing this information make such integrations a hugely difficult effort.

      Only by integrating all available data can we begin to (1) assess local, regional, and ultimately the global impacts of these emissions, and (2) identify net improvements to our environmental practices that are only apparent when looking at the combined, interconnected body of collected data. Blockchain enables the integration of these data sets for quick, yet comprehensive “big picture” assessments.

      Blockchain technology is a highly disruptive technology that offers an efficient way of storing records (called blocks) which are linked using cryptography. While still in its infancy, blockchain promises to change the world as we know it, much like the World Wide Web did after its public introduction in 1991. Today, the technology is most widely associated with digital currencies and money transfers. In time, however, blockchain technology will not only shift the way we use the internet, but it will also revolutionize the global economy and almost all transactional business that relies on an intermediary.
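The phrase “records (called blocks) which are linked using cryptography” can be illustrated with a short Python sketch. It shows only the hash-linking idea: each block stores the hash of the previous block, so altering an earlier record invalidates the chain. It deliberately leaves out everything else a real blockchain needs (distribution across nodes, consensus, signatures), and the function names and the emissions records are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(record, prev_hash):
    """SHA-256 over the record together with the previous block's hash."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records into a list of blocks, each pointing at its predecessor."""
    chain, prev = [], GENESIS
    for record in records:
        h = block_hash(record, prev)
        chain.append({"record": record, "prev": prev, "hash": h})
        prev = h
    return chain


def is_valid(chain):
    """Recompute every hash and check that each link matches."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["record"], block["prev"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain([
    {"facility": "Plant A", "chemical": "SO2", "kg": 120.0},
    {"facility": "Plant B", "chemical": "SO2", "kg": 80.0},
])
print(is_valid(chain))            # True

chain[0]["record"]["kg"] = 1.0    # tamper with an earlier report
print(is_valid(chain))            # False
```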

      One Environment, Health, and Safety and Sustainability  (EHS+S) sector well positioned to benefit from blockchain technology is emissions monitoring and reporting. I reported more on the technology and its impact on EHS space here.

      Environmental monitoring current practice

      Presently, companies with emissions monitor them in accordance with regulatory requirements, input the resulting data into a database or spreadsheet, perform emissions calculations on the entered readings, and then report the results of these calculations to regulators. The entire focus of this process is to (1) determine whether emissions of a single chemical or chemicals exceed prescribed levels and (2) evaluate the effect of these discharges on the media to which the compounds have been introduced by the polluting industry or other sources. There is no suitable software technology or mechanism to look at aggregate emissions across geographical areas or sectors or at how emissions of one type interact with emissions of an entirely different kind. Examining such interactions could be far more critical than monitoring and assessing the impacts on human health and the environment of single-parameter emissions to only one medium, and may reveal new opportunities for optimizing our EHS+S practices for reduced cost with similar or improved performance.
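Once reports from different facilities and media sit in one place, the aggregation itself is simple, as the following Python sketch suggests. It assumes the reports have already been normalized into a common record layout with consistent units; the field names, the facilities, the quantities, and the `aggregate` helper are all hypothetical, and the hard part in practice is the integration step that this sketch takes for granted.

```python
from collections import defaultdict

# Hypothetical, already-normalized emission reports (kg per reporting period).
reports = [
    {"facility": "Plant A", "region": "R1", "medium": "air", "chemical": "SO2", "kg": 120.0},
    {"facility": "Plant B", "region": "R1", "medium": "air", "chemical": "SO2", "kg": 80.0},
    {"facility": "Plant B", "region": "R1", "medium": "water", "chemical": "arsenic", "kg": 2.0},
    {"facility": "Plant C", "region": "R2", "medium": "air", "chemical": "SO2", "kg": 40.0},
]


def aggregate(reports, keys):
    """Sum reported mass over any combination of fields, e.g. region and chemical."""
    totals = defaultdict(float)
    for r in reports:
        totals[tuple(r[k] for k in keys)] += r["kg"]
    return dict(totals)


by_region_chemical = aggregate(reports, ("region", "chemical"))
by_medium = aggregate(reports, ("medium",))

print(by_region_chemical)
print(by_medium)
```

The same helper can group by sector, watershed, or any other shared field, which is the kind of cross-source view that separately reported data sets do not provide today.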

      Aggregate emissions

      To take a hypothetical scenario, consider the possible consequential damages when two incompatible streams of chemicals or waste mix and react to create even more harmful chemicals. EPA has only recently started looking into these types of scenarios. Its Envirofacts databases allow the public to retrieve information from multiple sources of Envirofacts’ System Data relevant to an area of interest. However, each database is a separate silo of information (Figure 1). The next step that ought to be taken is to assess and, as needed, report on the possible interaction of incompatible emission sources that are nearby but are independently monitored and stored in disconnected databases (see Figure 2 below).


      Figure 1: EPA Envirofacts databases allow the public to retrieve information from multiple sources, but only one source at a time, and the sources are disconnected from each other.

      Most everyone taking prescription medicines comes to understand that interactions between drugs are quite common. Imagine something similar to the interaction of drugs in your body happening on a much larger scale in the environment. One does not have to imagine. EPA recently imposed the highest environmental fine ever at the 2,530-acre Eastern Michaud Flats Contamination Superfund site near Pocatello, Idaho. Two adjacent on-site phosphate ore processing facilities, the FMC Corporation and the J.R. Simplot Company, began operations at the site in the 1940s. The J.R. Simplot facility produces solid and liquid fertilizers using phosphate ore, sulfur, air and natural gas. The FMC plant is North America’s largest producer of elemental phosphorus which is used in a variety of products from cleaning compounds to foods.

      Operations at these plants have independently contaminated both the groundwater and soil with hazardous chemicals. Both plants have received numerous environmental violations, many of which were settled with the EPA. Each of the sites has its environmental ills (and fines), but the more significant environmental problem is a combined regional plume. Everyone knows that acids and metals do not play well together. Sulphuric acid from the J.R. Simplot operation has leaked from surface impoundments into the groundwater and, on its way downstream, has leached all kinds of toxic metals from the FMC site, creating a highly poisonous plume of contaminants. An accurate assessment of the environmental disaster that exists in this area requires that the environmental impact of the two plants be examined in toto. Blockchain-based monitoring technology would allow both the public and regulators to see the resultant subsurface commingled plume and possibly pave the way to a more comprehensive remedy.

      Issues involving contamination of multiple media have also arisen at sites where discharges of volatile organic compounds or VOCs have occurred. In Silicon Valley, where I live, many engineering consultants have made their living chasing plumes of VOC chemicals (e.g., TCE) and then, when deemed appropriate, have installed various groundwater treatment plants tucked in the back of parking lots of companies like Google or HP to ameliorate this contamination. Santa Clara, the central county in Silicon Valley, is home to more Superfund sites than any other county in the United States.

      The process is analogous to rinsing detergent from a sponge. After many rinses, it still seems to have more in it. It is an endless process with little environmental benefit. Has anyone looked at the additional impact of the high energy demand for treatment systems that have minimal effect on improving groundwater, but can contribute significant CO2 equivalents to the atmosphere?

      With blockchain technology, we could simultaneously measure the positive effect of the treatment plant removing contaminants from water and the negative impact that this same plant produces by contributing to CO2 emissions. Quantities of removed chemicals could be plotted in real time against the CO2 emissions resulting from the high energy usage of the treatment plant. This would allow companies operating treatment plants and regulators overseeing them to determine at what point in time continued treatment could be harming, not helping, the environment. It is these types of analyses that would benefit society and help with the decision to shut down a remediation process when diminishing returns of the treatment system are reached.


      Figure 2: Interaction of incompatible emission sources is better managed if emissions are aggregated than if independently monitored and stored in disconnected databases.

      How would blockchain technology help in a scenario like this? Chemical removal rate would be tracked in one block (of the chain) and CO2 emissions in another. Owner and regulator would agree on the formula to determine when the treatment process ceases to produce a significant environmental benefit. At this point, the system would be shut down. All of this would be monitored and measured in real time, and more importantly, it would be transparent to the owner, regulator, and the public.
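      The agreed formula could take many forms. As a hedged sketch, assume the owner and regulator settle on a minimum mass of contaminant removed per tonne of CO2 equivalents emitted, evaluated for each reporting period. The record fields, the threshold, and every number below are hypothetical and serve only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass
class PeriodRecord:
    period: str
    contaminant_removed_kg: float  # mass removed by the treatment plant
    co2e_emitted_kg: float         # CO2 equivalents from the plant's energy use


def first_period_below_threshold(records, min_kg_removed_per_tonne_co2e):
    """Return the first period whose removal-per-tonne-CO2e ratio drops below
    the agreed minimum, or None if the ratio never drops below it."""
    for rec in records:
        tonnes_co2e = rec.co2e_emitted_kg / 1000.0
        if tonnes_co2e > 0:
            ratio = rec.contaminant_removed_kg / tonnes_co2e
        else:
            ratio = float("inf")  # no emissions, so no environmental cost to weigh
        if ratio < min_kg_removed_per_tonne_co2e:
            return rec.period
    return None


# Hypothetical yearly totals showing diminishing returns of treatment.
history = [
    PeriodRecord("2015", 40.0, 50_000.0),  # 0.80 kg removed per tonne CO2e
    PeriodRecord("2016", 20.0, 50_000.0),  # 0.40
    PeriodRecord("2017", 5.0, 50_000.0),   # 0.10
    PeriodRecord("2018", 1.0, 50_000.0),   # 0.02
]

print(first_period_below_threshold(history, 0.2))  # 2017
```

      In this sketch the removal records and the CO2 records would come from separate blocks, and the function is the only piece both parties need to agree on in advance.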

      Emissions measures should be preemptive, not reactive

      When you think about emissions, they are generally (except for incidents and accidents) operating problems that can be managed and optimized before discharges even happen. It is to the benefit of companies to do it this way. Every process that has an exhaust or smokestack for dispersing air emissions, or pipelines for discharging liquids to surface receptors or water bodies, could be managed to reduce harmful emissions coming out of the system regardless of regulatory prescribed permissible levels. As any organization with a legacy environmental site knows, it is far more cost-effective to eliminate the original cause of emission than to spend decades of effort to remediate after the fact.

      Unfortunately, many businesses are currently not genuinely looking at the aggregated data they collect about their emissions, wastewater, and energy use alongside their operational metrics. Current practices for EHS+S data management only allow for very simplistic comparison of normalized indicators between these disparate data sets.  But it would benefit these operators to gather, aggregate and analyze data, and then make better, more cost-effective decisions as part of their risk-management protocols, while still maintaining their environmental compliance requirements. Blockchain technology allows for review of more detailed data when making decisions with aggregated data sources so that managers can look beyond the simple normalized performance indicators. For example, many organizations only review their environmental and sustainability performance on an annual basis, mainly because the current tools to aggregate this data require them to be evaluated on a consistent time frame, and there is a significant investment in bringing all of the relevant data together. But through blockchain technology, the data maintain their connection at every level.  This allows for trend evaluation at other time frames not currently being examined. So if some short-term operational practice causes a spike in emissions, that issue can be identified and resolved immediately, rather than waiting for the end of the year, when the emissions have already happened, and the effect may not even be apparent when averaged out on an annual time frame. Then, even looking beyond the facility or organization, blockchain also allows for data aggregation across industry, region, and country, so that we will be in a better position to forecast the future and assess the viability of different measures to ameliorate the problems confronting us.
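      The point about annual averaging can be shown with a few lines of arithmetic. The monthly values and the limit below are invented for illustration; the sketch only demonstrates how a one-month spike disappears in a yearly mean but is obvious at a monthly time frame.

```python
from statistics import mean

# Hypothetical monthly emissions (arbitrary units) with one short-term spike.
monthly = {
    "Jan": 10.0, "Feb": 10.0, "Mar": 10.0, "Apr": 10.0,
    "May": 10.0, "Jun": 10.0, "Jul": 40.0, "Aug": 10.0,
    "Sep": 10.0, "Oct": 10.0, "Nov": 10.0, "Dec": 10.0,
}
limit = 15.0  # hypothetical internal action level

annual_average = mean(list(monthly.values()))
spikes = [month for month, value in monthly.items() if value > limit]

print(annual_average)          # 12.5
print(annual_average > limit)  # False: the yearly view shows no problem
print(spikes)                  # ['Jul']: the monthly view shows the spike
```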

      A bigger picture

      There is a growing need for companies to bring together information from their vast disconnected databases, single tenant clouds, and spreadsheets, and then mine the data they collect from these sources. In a decade or so, planet Earth may be a meshed grid of static sensors coupled with movable ones installed on people, animals (yes animals roaming in the wild), transportation devices, and other moving objects to collect data in real time. The conversation about the environmental landscape has evolved drastically over the last 50 years as we continue to understand the extent to which human activity has affected the planet. Companies and society need a collective and holistic understanding of the problems we face.

      The only way to understand the full picture, and in turn to act meaningfully on a global level, is for all individuals and companies to understand the impact of their activities. It’s impossible to mitigate the net risks and effects of these activities on the planet when we have not fully assembled the data to characterize the problem and understand the full picture. Blockchain technology offers the best path forward, making it possible for environmental data to be integrated at multiple levels. Any coordinated effort of this magnitude will be years in the making, but every journey starts with a first step. There are two impediments to instituting a change like this: technology (until recently, blockchain did not exist) and a government with the initiative to require such technology. Just as was the case with the internet revolution of the nineties, the rate of progress in technology is surpassing politicians’ ability to come to grips with its impact on society.

      So far, there have been no imposed data exchange standards, which are a prerequisite for broad data exchange and for implementation of blockchain technology. But in the meantime, progressive organizations will want to start taking advantage of this technology to look at their operations and make more informed EHS+S decisions.

      Looking forward with blockchain technology

      Perhaps blockchain technology is not ready for prime time. Some may argue that it creates a secondary problem of additional energy consumption much like water treatment systems described earlier. This is a theme that is advocated by some media outlets and blockchain skeptics who argue that the computer algorithms require significant amounts of electricity to power the servers on which they run. Estimates of blockchain’s soaring energy use are likely overstating the electric power used as the current debate on power consumption is not backed by hard data. When it comes to technology, history has consistently shown that the cost will always decrease, and the impact will still increase over time. It is inevitable that blockchain technology will become more accessible with reduced infrastructure over the next few years.


      Blockchain could completely change how companies run their businesses and present new opportunities far beyond sustainability and environmental emissions management.

      We are living in a world where companies and governmental agencies are not able to comprehensively analyze EHS+S information efficiently. Using blockchain technology will allow organizations to track, store, roll up, gain insights into, and also share their data with other interested parties as needed. It has the potential to put accurate and verifiable information into the hands of companies and regulating agencies more quickly. To make better progress on how we use EHS+S information, regulators will need to find ways that reward positive and proactive behaviors. We are not going to solve these issues by fining emitters until they behave. Blockchain technology can help move us away from the punitive approach and toward a more collaborative one by assisting companies to reduce their emissions while lowering their operating costs at the same time. Social sharing elements may also play a role here, giving companies that benefit from the fruits of blockchain technology a valuable marketing and PR advantage over those that do not adopt this technology and, as such, lag behind in their progress on environmental issues.

      12 ways commercial SaaS can save your complex environmental data (part 4/4)

      Continued from Part 3

      11) Databases are simply more capable when it comes to data stewardship

      Data management is a broad term that includes the range of activities that we have discussed elsewhere in this blog series, including sample planning and collecting, inputting data, uploading EDDs (Electronic Data Deliverables), and analyzing and reporting environmental data and information.

      However, the full scope of data stewardship is even broader than this, including necessary things like knowing where your data is located and knowing the quality of the data used in your regulatory reports.

      Here at Locus, we have had new customers come to us with some incredible horror stories:

      • data “held hostage” by third parties
      • data lost over time with multiple contractor changes
      • data stored in email or file cabinets
      • data in scattered piles of PDF documents or hard copies (very typical for boring logs)
      • labs unable to generate fresh EDDs due to laboratory LIMS system changes or industry consolidation

      These are just some of the latest examples we have encountered.  We are constantly surprised and concerned at the variety of ways that organizations can unwittingly put their critical data at risk.

      The key to effective data stewardship is to know where it is, know its quality, and have uninterrupted access to it.  This is something that Excel can’t offer, and it’s also something a hodgepodge of spreadsheets, emailed PDF files, stacks of hard copy boring logs in multiple offices, and custom-built databases simply can’t do.


      12) Databases are more supportive of software quality assurance practices

      Quality assurance is a popular topic of discussion, but few people consider it in terms of the configurations (behind the scenes code) that people add to popular off-the-shelf programs such as Excel or Access.  Of course, these programs go through rigorous quality assurance testing before being released to the public to ensure they perform as expected.  After all, no one questions that Excel can perform math correctly.

      However, what is often not considered are the macros, custom functions, and calculations that are often added to spreadsheets when deployed for managing environmental data and other tasks.

      Here at Locus, we have yet to encounter one Excel spreadsheet or Access database from a customer that has been documented and tested and that comes with clear user instructions.  We also have never encountered anyone who has never made errors in Excel by picking the wrong cells for a formula.

      You would avoid these types of oversights and lax QA protocols with commercial software that relies on expert functionality for its business.  For example, if Locus EIM did not perform proper calculations (repeatedly) or load data properly (repeatedly), the product would not be successful in the marketplace, and we wouldn’t have thousands of users who trust and use our software every day.  This level of quality assurance is simply not found in user-configured, ad hoc “databases” built in Excel or Access.

      As more environmental sites become embroiled in litigation, or are in the process of making health and risk cleanup decisions, the importance of data quality assurance cannot be ignored.  Water utilities that are charged with providing clean, safe drinking water to the public can’t rely on ad hoc Excel or Access systems to analyze such critical data.


      Custom databases built in-house vs. commercial software

      If you’re serious about rethinking your environmental data management system and finally ditching your spreadsheets for a more mature and secure solution, you might be considering the advantages of having an in-house team or a contractor build a custom database system for you with Access or another widely available tool.

      After all, only you know the idiosyncrasies that your organization deals with, right?  You can have your developers tailor your system to fit your needs exactly.

      Moreover, with a custom solution, you can make sure that it’s integrated with your organization’s other systems and processes, like document management and invoicing.

      Some of this might be true, but let’s take a closer look—and consider the tradeoffs.

      Is your organization as unique as you think?

      Every organization is different, especially if we’re comparing organizations and businesses across different industries. Water utilities face an entirely different set of challenges than a multinational oil and gas corporation. Despite these differences, diverse organizations share some remarkable similarities when it comes to managing environmental data, and in most cases, you’re not the only environmental professional who has experienced most of the challenges your organization has faced.

      A commercial software vendor with customers in a wide variety of industries naturally collects an aggregated body of knowledge about the environmental data management needs (and quirks) of their customers. By adopting existing commercial software, your organization can benefit from the wisdom of this crowd, getting access to functionality and modules that can streamline your processes in ways you couldn’t imagine (or afford to develop).

      On top of this, commercial software solutions that have been around for a while usually have pretty good support for various API integrations of commonly-used systems, or they can easily build the integrations into their solution (usually for a small fee). Attempting to build these integrations into a custom, in-house solution can lead to astronomical costs and unforeseen complications that often can’t be accurately estimated until the work is well underway.

      Can your in-house resources fully examine your business processes and accurately identify your needs?

      Commercial software vendors are in the business of translating real customer needs into successful software products.  As an environmental professional, you probably have a good understanding of your business processes, but do you trust yourself and your development team to find and implement the most efficient, effective, and scalable solution for managing your ever-increasing amounts of data?

      “Off-the-shelf” can sometimes be a misnomer.  Many commercial vendors nowadays have learned to build their platforms to be configurable and customizable, to better accommodate the wide variety of customer industries and organization-specific needs.  Don’t be afraid to reach out to a few vendors to see what they can offer. Consider a vendor whose experienced domain experts have been in your shoes, are motivated to help you solve your problems, and can deliver the solution you need.

      A good vendor will ask many questions about your business processes, your current system, and your pain points.  You might be surprised at how easy-to-configure and flexible “off-the-shelf” systems can be.

      Think forward—could today’s “bells and whistles” become tomorrow’s critical features?

      Are GIS and mobile part of your current environmental data management process?  If so, you will absolutely want to have them integrated with any database solution. Otherwise, you’ll be dealing with a mess of duplicate and out-of-date data all over again.  Building integrations with these complex systems can be just as challenging as building the database management system itself.

      A robust commercial software solution comes with these features built-in.

      Let’s go even further—have you ever thought about how automation, the Internet of Things, or artificial intelligence could impact your business processes in 5 or 10 years (or sooner)?  Commercial software vendors often have the resources and the incentives to explore new frontiers of technology and stay on the cutting-edge of their market.  When your peers or competitors start integrating these new technologies into their workflows, will your custom system be able to adapt and keep up?


      Hopefully, by this point, we have convinced you of the superiority of database management systems over spreadsheets when it comes to managing environmental data.  Now, it’s time to make some efforts to examine the specific shortcomings of your current system and consider your options.

      Now that you have had the opportunity to consider why SaaS databases allow you to manage your complex data efficiently, make data integration and reporting faster and easier and scale to your requirements, contact Locus Sales at (650) 960-1640 or fill out the contact form below to find out what Locus can do for you.

      12 reasons why commercial SaaS databases are ideal

      Make sure to read the entire series to find out about 12 reasons commercial SaaS databases excel at managing complex environmental data!

      About the author—Gregory Buckle, PhD, Locus Technologies

      Dr. Buckle has more than 30 years of experience in the environmental field, most of which have been devoted to the design, development, and implementation of environmental database management systems. When he joined Locus in 1999, he was responsible for building and deploying Locus’ cloud-based EIM software. He was also instrumental in customizing EIM for the water utility industry and developing EIM’s powerful Sample Planning and Data Validation modules. The latest iteration of the Sample Planning module that Dr. Buckle built is currently being used by Los Alamos National Laboratory and San Jose Water Company to plan and schedule thousands of samples per year.


      About the author—Marian Carr, Locus Technologies

      Ms. Carr is responsible for managing overall customer solution deployments and customer relationships with Locus’ government accounts. Her career at Locus includes heading the product development team of the award-winning cloud-based environmental ePortal solution as well as maintaining and growing key customer accounts with Locus’ Fortune 100 enterprise deployments. In addition, Ms. Carr was instrumental in driving the growth and adoption of the Locus EIM platform with key federal and water organizations.


       


        12 ways commercial SaaS can save your complex environmental data (part 3/4)

        Continued from Part 2

        6) Simultaneous usage is better supported by databases

        Microsoft Support confirms that it is possible to share an Excel workbook.  Two or more individuals can indeed access the same spreadsheet simultaneously. Edits are even possible:

        You can create a shared workbook and place it on a network location where several people can edit the contents simultaneously… As the owner of the shared workbook, you can manage it by controlling user access to the shared workbook and resolving conflicting changes. When all changes have been incorporated, you can stop sharing the workbook.

        Sharing a spreadsheet may work in a small office or facility with a couple of users, but it certainly is not a viable option when more users need to access, view, and generate reports. This is a task for which databases are far better suited.

        On any given day, for example, Locus EIM supports hundreds of simultaneous users. Some may be inputting form data, while others are loading and checking laboratory EDDs, and still others are creating reports and graphs and viewing data on maps and in tables.  Many of these are very data-intensive processes—yet Locus EIM handles them seamlessly.

        Being able to handle such simultaneous activity is inherent in the designs of relational databases. In contrast, the ability to share an Excel workbook is not a native feature of such software and, as such, is unlikely to meet the needs of most organizations (especially as they evolve and grow).


        7) Processing speed, capacity, and scalability are better with databases

        Compared to spreadsheets, databases are the hands-down winners with respect to processing speed and the numbers of records they can store. Higher-end databases can store hundreds of millions of records.  In contrast, spreadsheets with hundreds of thousands of records can bog down and become difficult to manage.

        An underappreciated yet critical difference is that while you’re using a spreadsheet, the entire file is stored in a computer’s random access memory (RAM). In contrast, when using a database, only the dataset that you are currently working with is loaded into RAM.

        To illustrate just how fast a powerful database can be, I sent a query to EIM at our secure facility on the opposite coast, asking how many “benzene” records were in one of our larger laboratory results tables (N > 4,500,000).  Sitting at a desk here in the hinterlands of Vermont, the result (“number of records = 64773”) came back in less than a second.  I did not even have time to call in the cows for their afternoon milking.
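        A count like this is a single query in any relational database. The sketch below uses Python's built-in `sqlite3` module with a tiny in-memory table; the table name `lab_results`, its columns, and the sample rows are hypothetical and do not reflect EIM's actual schema or database engine.

```python
import sqlite3

# In-memory database standing in for a much larger results table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lab_results (id INTEGER PRIMARY KEY, analyte TEXT, result REAL)"
)
# An index on the filtered column lets the engine avoid scanning every row.
conn.execute("CREATE INDEX idx_lab_results_analyte ON lab_results (analyte)")

rows = [("benzene", 0.5), ("toluene", 1.2), ("benzene", 0.7), ("TCE", 3.1)]
conn.executemany("INSERT INTO lab_results (analyte, result) VALUES (?, ?)", rows)

(count,) = conn.execute(
    "SELECT COUNT(*) FROM lab_results WHERE analyte = ?", ("benzene",)
).fetchone()
print(f"number of records = {count}")  # number of records = 2
```

        Only the matching rows are touched to answer the query, which is the contrast with a spreadsheet that must hold the entire file in memory.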

        Because they are both faster and can store more, databases scale far better than spreadsheets.  As such, they can meet both your current and future requirements, no matter how fast the information you are required to store grows over time.


        8) Databases support creating and following complex workflows

        In contrast to spreadsheets, databases support the creation of formal workflows. Let’s consider one example from EIM—its cradle-to-grave sample planning, collection, and tracking process.

        Using EIM’s Sample Planning module, you can:

        • Identify one-time or recurring samples and analyses that need to be collected
        • Transfer information on these planned samples and analyses to Locus Mobile
        • Collect field data
        • Upload field data to EIM (where it is stored in various tables)
        • Generate chains of custody and sample bottle labels (after which the samples are sent to the lab for analysis)
        • (Days or weeks later, labs upload their findings to EIM’s holding table, where they are automatically matched with the previously uploaded field information)
        • Receive notifications that the lab results are now available (additional notifications can be sent if any results are found to exceed a regulatory limit)
        • Track the status of the samples throughout this process with forms that can tell you the status of each planned sample, including whether any results are late or missing
        • Generate relevant reports, maps and charts for internal use or for submittal to the appropriate agency


        You simply could not build such a comprehensive and sophisticated workflow in Excel.  Notice we mentioned maps.  Building complex workflows is yet another area where advanced, integrated database management systems shine, especially as they can automatically create GIS-based maps of the results from data housed in the database—without the need (or expense) for ancillary software.


        9) Databases provide more security than spreadsheets

        Microsoft identifies the following security features available in Excel:

        User-level data protection

        You can remove critical or private data from view by hiding columns and rows of data, and then protect the whole worksheet to control user access to the hidden data. In addition to protecting a worksheet and its elements, you can also lock and unlock cells in a worksheet to prevent other users from unintentionally modifying essential  data.

        File-level security

        At the file level, you can use encryption to prevent unauthorized users from seeing the data. You can also require password entry to open a workbook, or you can secure a workbook by employing a digital signature.

        Restricted access to data

        You can specify user-based permissions to access the data, or set read-only rights that prevent other users who may be able to view the data from making changes to it.

        Perusing the web for postings comparing the features of databases to spreadsheets, you’ll find plenty of accusations that spreadsheets lack security and control features. Clearly, Microsoft’s description of the security features available in Excel shows that this isn’t the case.  However, these security features may not be as robust as Microsoft claims, and they may prove difficult for the average user to implement.

        As Martin Cacace of BoundState Software explains, “Although Excel allows you to protect data with a password and Windows-based permissions, it is extremely delicate and requires a deep understanding of Excel. Some of these features won’t work if you have people using different operating systems or if you need access from other computers. Even a password protected Excel file is not really secure; there are tools on the Internet that anyone can use to unlock a protected Excel file without knowing the password.”

        Databases offer far more control than spreadsheets over who can access and make changes to data.  As an example, Locus EIM users must have a unique username and password. Users can be assigned to multiple privilege levels, ranging from “administrator” to “guest”.  Customers that require a more fine-grained approach can use “roles” to assign permissions to specific modules, activities, or functionality to users.  Password security is typically robustly designed in commercial databases, and can be configured to require complex passwords, session expiry, and password expirations to match customer IT requirements, something Excel would find challenging. Locus EIM also tracks all users and makes that information available to database admins to provide yet another layer of security for the system.


        10) Databases are better at preventing data loss and data corruption

        Because of the general lack of controls that exist in most spreadsheets, it is far easier for a user to wreak havoc on  them. One of the most dreaded developments that can occur is associated with the “Sort” function. A user may choose to sort on one or more columns, but not all—resulting in the values in the missed columns not matching up with those in the sorted ones.  Nightmares like this are easily preventable (or are simply not possible) in databases.
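        The sort problem is easy to reproduce outside of Excel. In this small sketch, with made-up sample IDs and results, sorting one column on its own scrambles the pairing, while sorting whole records keeps every value attached to its sample, which is how a database treats rows.

```python
# Hypothetical worksheet with two columns that belong together row by row.
sample_ids = ["S3", "S1", "S2"]
results = [0.9, 0.1, 0.5]

# Spreadsheet-style mistake: sort one column and leave the other untouched.
broken_rows = list(zip(sorted(sample_ids), results))
print(broken_rows)  # [('S1', 0.9), ('S2', 0.1), ('S3', 0.5)]

# Record-oriented sort: each row moves as a unit, so the pairing survives.
rows = list(zip(sample_ids, results))
sorted_rows = sorted(rows, key=lambda row: row[0])
print(sorted_rows)  # [('S1', 0.1), ('S2', 0.5), ('S3', 0.9)]
```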

        Another advantage of database management systems is their ability to create audit trails, which preserve the original values in separate tables when changes are made to records.  In the event that a user wants to undo some changes (including deletions) that he or she has made to a table, a data administrator can retrieve and restore the original state of the modified or deleted records.  Also importantly, the circumstances of these changes are fully tracked (who, what, when, where), which is a minimum requirement for any quality assurance process.
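        One common way to build such an audit trail is a database trigger that copies the old value into a separate table before each change. The sketch below uses SQLite through Python's `sqlite3` module; the table names, columns, and user names are assumptions chosen for illustration and are not a description of how Locus EIM implements auditing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE results (
        id INTEGER PRIMARY KEY,
        analyte TEXT,
        value REAL,
        modified_by TEXT
    );
    CREATE TABLE results_audit (
        audit_id INTEGER PRIMARY KEY,
        result_id INTEGER,
        old_value REAL,
        changed_by TEXT,
        changed_at TEXT
    );
    -- Before any change to value, preserve the original in the audit table.
    CREATE TRIGGER results_before_update
    BEFORE UPDATE OF value ON results
    BEGIN
        INSERT INTO results_audit (result_id, old_value, changed_by, changed_at)
        VALUES (OLD.id, OLD.value, NEW.modified_by, datetime('now'));
    END;
    """
)

conn.execute(
    "INSERT INTO results (analyte, value, modified_by) VALUES ('benzene', 0.5, 'loader')"
)
# A user overwrites the value by mistake.
conn.execute("UPDATE results SET value = 5.0, modified_by = 'jdoe' WHERE id = 1")

# An administrator looks up the original value and who changed it, then restores it.
original_value, changed_by = conn.execute(
    "SELECT old_value, changed_by FROM results_audit "
    "WHERE result_id = 1 ORDER BY audit_id LIMIT 1"
).fetchone()
conn.execute(
    "UPDATE results SET value = ?, modified_by = 'admin' WHERE id = 1",
    (original_value,),
)

(current_value,) = conn.execute("SELECT value FROM results WHERE id = 1").fetchone()
print(original_value, changed_by, current_value)  # 0.5 jdoe 0.5
```

        Note that the restore is itself an update, so it is also written to the audit table; nothing in the history is lost.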

        Lastly, Excel stores the entire spreadsheet in memory, so if there is a system crash, you will lose everything you have entered or edited since your last save. In contrast, each operation you perform in a database is saved as you complete it. Moreover, most databases have daily backups, and in some cases, maintain an up-to-date copy of the data on a secondary device. Additionally, data is typically backed up in multiple geographic locations to provide even more recovery options in a disaster situation.  Any good commercial database vendor will be happy to share their disaster recovery process because securing and maintaining your data is their most important job. In short, you can rest assured that your valuable data—often gathered over many years at a high cost—will not be lost if it is stored in a DBMS like Locus EIM.


        12 reasons why commercial SaaS databases are ideal

        Make sure to read the entire series to find out about 12 reasons commercial SaaS databases excel at managing complex environmental data!

        About the author—Gregory Buckle, PhD, Locus Technologies

        Dr. Buckle has more than 30 years of experience in the environmental field, most of which have been devoted to the design, development, and implementation of environmental database management systems. When he joined Locus in 1999, he was responsible for building and deploying Locus’ cloud-based EIM software. He was also instrumental in customizing EIM for the water utility industry and developing EIM’s powerful Sample Planning and Data Validation modules. The latest iteration of the Sample Planning module that Dr. Buckle built is currently being used by Los Alamos National Laboratory and San Jose Water Company to plan and schedule thousands of samples per year.


        About the author—Marian Carr, Locus Technologies

        Ms. Carr is responsible for managing overall customer solution deployments and customer relationships with Locus’ government accounts. Her career at Locus includes heading the product development team of the award-winning cloud-based environmental ePortal solution as well as maintaining and growing key customer accounts with Locus’ Fortune 100 enterprise deployments. In addition, Ms. Carr was instrumental in driving the growth and adoption of the Locus EIM platform with key federal and water organizations.


         


          12 ways commercial SaaS can save your complex environmental data (part 2/4)

          Continued from Part 1

          2) Data quality is better with databases

          Since 2002, a dedicated group of Locus employees has been involved with migrating data into EIM from spreadsheets provided to us by customers and their consultants. As such, we have firsthand experience with the types of data quality issues that arise when using spreadsheets for entering and storing environmental data.

          Here is just a small selection of these issues:

          • Locations with multiple variations of the same ID/name (e.g., MW-1, MW-01, MW 1, MW1, etc.)
          • Use of multiple codes for the same entity (e.g., SW and SURFW for surface water samples)
          • Loss of significant figures for numeric data
          • Special characters (such as commas) that may cause cells to break unintentionally over rows when moving data into another application
          • Excel’s frustrating insistence (unless a cell format has been explicitly specified) on converting CAS numbers like “7440-09-7 (Potassium)” into dates (“9/7/7440”)
          • Bogus dates like “November 31” in columns that do not have date formats applied to them
          • Loss of leading zeros associated with cost codes and project numbers (e.g., “005241”) that have only numbers in them but must be stored as text fields
          • The inability to enforce uniqueness, leading to duplicate entries
          • Null values in key fields (because entries cannot be marked as required)
          • Hidden rows and/or columns that can cause data to be shifted unintentionally or modified erroneously
          • Bogus numerical values (e.g., “1..3”, “.1.2”) stored in text fields
          • Inconsistent use of lab qualifiers: in some cases, these appear concatenated in the same Excel column (e.g., “10U, <5”), while in other cases they appear in separate columns
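To make the first issue in the list concrete, here is a small Python sketch that groups location ID variants under one canonical form. The function name and the chosen convention (letters, a hyphen, and a two-digit number) are assumptions for illustration only; a real migration would follow whatever naming standard the customer has agreed on.

```python
import re
from collections import defaultdict


def normalize_location_id(raw: str) -> str:
    """Map variants such as 'MW-1', 'MW 1', and 'MW1' to one form, 'MW-01'.

    Assumed convention (hypothetical): letters, a hyphen, then a number
    zero-padded to two digits.
    """
    match = re.fullmatch(r"\s*([A-Za-z]+)[\s\-_]*0*(\d+)\s*", raw)
    if match is None:
        raise ValueError(f"Unrecognized location ID: {raw!r}")
    prefix, number = match.groups()
    return f"{prefix.upper()}-{int(number):02d}"


# The variants quoted in the list above, plus one unrelated location.
raw_ids = ["MW-1", "MW-01", "MW 1", "MW1", "MW-2"]

# Group the raw spellings by their canonical form.
variants = defaultdict(set)
for raw in raw_ids:
    variants[normalize_location_id(raw)].add(raw)

# Report every canonical ID that was spelled more than one way.
for canonical, seen in sorted(variants.items()):
    if len(seen) > 1:
        print(canonical, "appears as", sorted(seen))
```

A check like this only reports likely duplicates; a person still needs to confirm that the variants really refer to the same location.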

          With some planning and discipline, you can avoid some of these problems in Excel. For example, you can create dropdown list boxes to limit the entries in a cell to certain values. However, this is not standard practice as most spreadsheets we receive come with few constraints built into them.

          While databases are indeed not immune to data quality issues, it is much easier for database designers to impose effective constraints on users’ entries. Tasks such as limiting the values in a column to selected entries, ensuring that values are valid dates or numbers, forcing values to be entered in selected fields, and preventing duplicate records from being entered are all easy to implement and standard practice in databases.

          However, properly designed databases can do even more. They can check that various combinations of values make sense—for example:

          • They can prevent users from entering analysis dates that are earlier than the associated sample dates.
          • They can verify that numerical entries are within a permitted range of values and make sense based on past entries. This check is so popular that it’s even part of our Locus Mobile app for collecting field data.
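The two checks above can be sketched in a few lines of Python. This is an illustration only and does not describe how Locus EIM or Locus Mobile implement them: the function name, the permitted range (a pH-like 0 to 14), and the three-standard-deviation rule for "makes sense based on past entries" are all assumptions.

```python
from datetime import date
from statistics import mean, stdev


def check_result(sample_date, analysis_date, value, history,
                 permitted_range=(0.0, 14.0), max_sigma=3.0):
    """Return a list of human-readable problems; an empty list means no flags."""
    problems = []
    if analysis_date < sample_date:
        problems.append("analysis date is earlier than sample date")
    low, high = permitted_range
    if not low <= value <= high:
        problems.append(f"value {value} is outside permitted range {low}-{high}")
    # Compare against past entries only when there are enough to estimate spread.
    if len(history) >= 3:
        spread = stdev(history)
        if spread > 0 and abs(value - mean(history)) > max_sigma * spread:
            problems.append(f"value {value} is unusual compared with past entries")
    return problems


# Hypothetical past pH readings for one location.
history = [7.1, 7.3, 7.0, 7.2]

print(check_result(date(2017, 4, 17), date(2017, 4, 20), 7.2, history))
print(check_result(date(2017, 4, 17), date(2017, 4, 16), 7.2, history))
print(check_result(date(2017, 4, 17), date(2017, 4, 20), 71.0, history))
```

Returning a list of problems rather than raising an error lets a data entry form show every issue at once, and lets a reviewer decide whether an unusual but legitimate value should be accepted.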

          Databases also provide the ability to verify the completeness of your data:

          • Have all samples been collected?
          • Have all analyses been performed on a sample?
          • Are there any analytes missing from the laboratory’s findings?

          You can specify such queries to run at any time. Replicating these checks within Excel, while not impossible, is simply not something most Excel users have the time, skill, or desire to build.
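As one example of a completeness check, the sketch below lists every requested analysis that has no matching result. It uses Python's built-in sqlite3 module; the `requested_analysis` table and all sample data are hypothetical and only meant to show the shape of such a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE requested_analysis (
    field_sample_id TEXT NOT NULL,
    analyte         TEXT NOT NULL,
    PRIMARY KEY (field_sample_id, analyte)
);
CREATE TABLE field_sample_result (
    field_sample_id TEXT NOT NULL,
    analyte         TEXT NOT NULL,
    result          REAL,
    PRIMARY KEY (field_sample_id, analyte)
);
-- Three analyses were requested, but the lab has reported only one.
INSERT INTO requested_analysis VALUES
    ('MW-01-12', 'Potassium'), ('MW-01-12', 'Sodium'), ('MW-02-12', 'Potassium');
INSERT INTO field_sample_result VALUES
    ('MW-01-12', 'Potassium', 12.5);
""")

# A LEFT JOIN keeps every request; rows with no matching result have NULLs
# on the result side, and those are the missing analyses.
missing = conn.execute("""
SELECT req.field_sample_id, req.analyte
FROM requested_analysis AS req
LEFT JOIN field_sample_result AS res
       ON res.field_sample_id = req.field_sample_id
      AND res.analyte = req.analyte
WHERE res.field_sample_id IS NULL
ORDER BY req.field_sample_id, req.analyte
""").fetchall()
print(missing)
```

The same query can be rerun unchanged whenever new results arrive.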


          3) It’s easier to prevent data duplication and redundancy when your data resides in your database

          One of the most striking differences between spreadsheets and databases is the prevalence of redundant information in spreadsheets. Consider, for example, these three tables in EIM:

          1. LOCATION
          2. FIELD_SAMPLE
          3. FIELD_SAMPLE_RESULT

          In this subset of their columns, “PK” signifies that the column is a member of the “primary key” of the table. The combination of values in these columns must be unique for any given record.

          [Figure: subset of columns in the LOCATION, FIELD_SAMPLE, and FIELD_SAMPLE_RESULT tables, with primary key (PK) columns marked]

          The two columns LOCATION_ID and SITE_ID can be used to link (join) the information in the FIELD_SAMPLE table to the LOCATION table. Furthermore, FIELD_SAMPLE_ID and SITE_ID can be used to link the information in FIELD_SAMPLE_RESULT to FIELD_SAMPLE. Because these links exist, we only need to store the above attributes of a given location or field sample once, in one table. This is very different from how data is handled in a single spreadsheet.
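A simplified version of these three tables can be sketched with Python's built-in sqlite3 module. The columns and primary keys shown here are assumptions chosen to match the description above, not the actual EIM schema, and the analyte values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (
    site_id       TEXT NOT NULL,
    location_id   TEXT NOT NULL,
    location_type TEXT,
    PRIMARY KEY (site_id, location_id)
);
CREATE TABLE field_sample (
    site_id         TEXT NOT NULL,
    field_sample_id TEXT NOT NULL,
    location_id     TEXT NOT NULL,
    sample_date     TEXT,
    PRIMARY KEY (site_id, field_sample_id)
);
CREATE TABLE field_sample_result (
    site_id         TEXT NOT NULL,
    field_sample_id TEXT NOT NULL,
    analyte         TEXT NOT NULL,
    result          REAL,
    PRIMARY KEY (site_id, field_sample_id, analyte)
);
-- The location type and the sample date are each stored exactly once.
INSERT INTO location VALUES ('SITE1', 'MW-01', 'WELL');
INSERT INTO field_sample VALUES ('SITE1', 'MW-01-12', 'MW-01', '2017-04-17');
INSERT INTO field_sample_result VALUES
    ('SITE1', 'MW-01-12', 'Potassium', 12.5),
    ('SITE1', 'MW-01-12', 'Sodium', 30.1);
""")

# Joining on the shared key columns rebuilds the wide, spreadsheet-like view.
report = conn.execute("""
SELECT l.location_id, l.location_type, s.sample_date, r.analyte, r.result
FROM field_sample_result AS r
JOIN field_sample AS s
  ON s.site_id = r.site_id AND s.field_sample_id = r.field_sample_id
JOIN location AS l
  ON l.site_id = s.site_id AND l.location_id = s.location_id
ORDER BY r.analyte
""").fetchall()

for row in report:
    print(row)
```

The joined output repeats "WELL" and the sample date on every row, but the stored data does not, so correcting either value means changing a single record.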

          Let’s compare how the data in a few of these columns might appear in a single spreadsheet compared to a database. We’ll look at the spreadsheet first:

          [Figure: example spreadsheet combining location, sample, and result data]

          Next, let’s see how this information would be stored in a database. Here we can see more fields since we’re not as constrained by width.

          First, the LOCATION table:

          [Figure: LOCATION table]

          Then, FIELD_SAMPLE:

          [Figure: FIELD_SAMPLE table]

          Lastly, FIELD_SAMPLE_RESULT:

          [Figure: FIELD_SAMPLE_RESULT table]

          One of the most striking differences between the spreadsheet and the database tables above is how much redundant information the spreadsheet contains. The Location Type of “WELL” is repeated in every record where location MW-01 appears, and the sample date of “04/17/2017” is repeated wherever sample MW-01-12 is present. Redundant information is one of the most significant drawbacks of using spreadsheets to store large amounts of data, particularly when many of the data values themselves (e.g., LOCATION_ID and FIELD_SAMPLE_ID above) have multiple attributes that must also be stored.

          Most spreadsheet data that we have received for import into EIM have consisted of either:

          1. Multiple worksheets of the same or similar formats, all containing a combination of sampling and analytical data
          2. A single worksheet containing tens of thousands of rows of such data

          Occasionally, customers have sent us multiple spreadsheets containing very different types of data, with one or more hosting sample and analytical results, and others containing location, well construction, or other supporting data. However, this is atypical; in most of the migrations that we have performed, redundant data is pervasive in the spreadsheet’s contents and inconsistencies in entries are common.

          Entering new records in a spreadsheet structured like the example above requires that the attributes entered for LOCATION_ID and FIELD_SAMPLE_ID be consistent across all records whose values are the same in these columns.

          The real problems surface when you have to edit records. You must correctly identify all affected records and change them all identically and immediately.

          Sounds relatively straightforward, doesn’t it?

          In fact, judging by what we have seen in our data migrations, discrepancies invariably creep into spreadsheets when edits are attempted. These discrepancies must be resolved when moving the data into a database where constraints prohibit, for example, a single sample from having multiple sample dates, times, purposes, etc.
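A discrepancy of this kind can be found with a short script before the data is loaded. The Python sketch below scans flat, spreadsheet-style rows for samples that carry more than one sample date. The rows and the helper name `find_conflicts` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical flat rows, as they might be read from a single worksheet.
flat_rows = [
    {"field_sample_id": "MW-01-12", "sample_date": "04/17/2017", "analyte": "Potassium"},
    {"field_sample_id": "MW-01-12", "sample_date": "04/17/2017", "analyte": "Sodium"},
    {"field_sample_id": "MW-01-12", "sample_date": "04/18/2017", "analyte": "Calcium"},
    {"field_sample_id": "MW-02-12", "sample_date": "04/17/2017", "analyte": "Potassium"},
]


def find_conflicts(rows, key, attribute):
    """Return {key value: sorted distinct attribute values} where more than one exists."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[attribute])
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


conflicts = find_conflicts(flat_rows, "field_sample_id", "sample_date")
print(conflicts)
```

The same helper can be pointed at any other pair of columns, such as location ID and location type, to find attributes that were edited in some rows but not in others.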

          In addition, audit trails are all but nonexistent in Excel. Many users save each edited version under a new filename as a crude form of audit tracking. This can quickly lead to a data management nightmare with no documented record of what changed. Just as important, almost all our customers, especially those involved with regulatory reporting, require audit tracking. It is typically required for sites that may be involved in litigation, or where decisions are made about health and safety risks, because both situations demand defensible and unimpeachable data.


          4) Entity relationships are more manageable in databases

          The discussion of data duplication and redundancy touches on another significant difference between databases and spreadsheets—how entity relationships are handled.

          Excel stores data in a two-dimensional grid. While it is possible to create relationships between data in different worksheets, this is not the norm and there are many limitations. More often, as we have stated elsewhere, Excel users tend to store their data in a single spreadsheet that grows increasingly unwieldy and hard to read as records are added to it.

          Let’s consider some of the relationships that characterize environmental sampling and analytical data:

          • Sampling locations are associated with sites or facilities—or, for our water utility customers, individual water systems. They may also belong to one or more planned sampling routes.
          • Different sampling locations have their own analytical and field measurement requirements.
          • Individual samples may be associated with one or more specific permits or regulatory requirements.
          • Trip, field, and equipment rinsate samples are linked to one or more regular field samples.
          • Analytical results are assigned to analysis lots and sample delivery groups (SDGs) by the laboratory.
          • Analysis lots and SDGs are the vehicle for linking laboratory QC samples to regular samples.
          • Analytical parameters are associated with one or more regulatory limits.
          • Individual wells are linked to specific boreholes and one or more aquifers.

          Modeling and building these relationships in Excel would be quite difficult. Moreover, they would likely lack most of the checks that a DBMS offers, like preventing orphans (e.g., a location referenced in the FIELD_SAMPLE table that has no entry in the LOCATION table).
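The orphan check mentioned above is usually expressed as a foreign key. The sketch below uses Python's built-in sqlite3 module with a simplified, hypothetical version of the LOCATION and FIELD_SAMPLE tables. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE location (
    site_id     TEXT NOT NULL,
    location_id TEXT NOT NULL,
    PRIMARY KEY (site_id, location_id)
);
CREATE TABLE field_sample (
    site_id         TEXT NOT NULL,
    field_sample_id TEXT NOT NULL,
    location_id     TEXT NOT NULL,
    PRIMARY KEY (site_id, field_sample_id),
    -- Every sample must point at a location that exists.
    FOREIGN KEY (site_id, location_id) REFERENCES location (site_id, location_id)
);
INSERT INTO location VALUES ('SITE1', 'MW-01');
""")

# A sample at a known location is accepted.
conn.execute("INSERT INTO field_sample VALUES ('SITE1', 'MW-01-12', 'MW-01')")

# A sample at an unknown location (MW-09) would be an orphan, so it is rejected.
try:
    conn.execute("INSERT INTO field_sample VALUES ('SITE1', 'MW-09-01', 'MW-09')")
    orphan_error = None
except sqlite3.IntegrityError as exc:
    orphan_error = str(exc)

print(orphan_error)
```

The same mechanism also prevents a location from being deleted while samples still refer to it.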


          5) Data reporting and integration are faster and easier with databases

          How do you create a report in Excel? If you’re working with a single spreadsheet, you use the “Data Filter” and “Sort” options to identify the records of interest, then move the columns around to get them in the desired sequence. This might involve hiding some columns temporarily.

          If you make a copy of your data, you can delete records and columns that you don’t want to show. If your data is stored in multiple spreadsheets, you can pull information from one sheet to another to create a report that integrates the different types of data housed in these spreadsheets. But this is a somewhat tedious process for all but the simplest of reports.

          Let’s contrast this drudgery with the simplicity and power offered by relational databases.

          In Locus EIM, for example, you pick the primary and secondary filter categories that you want to use to restrict your output to the records of interest. Then, you select the specific values for these data filter categories (usually from dropdowns or list-builder widgets). There is no limit on how many categories you can filter on.

          Typically, you then choose a date range. Lastly, you pick which data columns you want to view, and in what order. These columns can come from many different tables in the database. For ease of selection, these also appear in dropdowns or list-builder widgets.

          When you have made your filter selections, Locus EIM pulls up the records matching your selection criteria in a data grid. You can further filter the records by values in specific columns in this grid, or hide or rearrange columns. If you want to share or keep a record of these data, you can export the contents of the displayed grid to a text file, Excel, XML, or PDF, or copy them to your clipboard.
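Behind a report screen like this, the selections typically end up as a parameterized query. The Python sketch below is a generic illustration, not the way Locus EIM builds its reports: the `results` table, the function names, and the column whitelist are all assumptions. It filters by analyte and date range, returns the chosen columns in the chosen order, and exports the rows as CSV text.

```python
import csv
import io
import sqlite3

# Columns a user is allowed to pick (hypothetical list).
ALLOWED_COLUMNS = ("location_id", "sample_date", "analyte", "result")


def run_report(conn, columns, analytes, date_from, date_to):
    """Return rows for the chosen columns, filtered by analyte list and date range."""
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"Unknown columns: {unknown}")
    # Column names come from the whitelist; filter values are bound as parameters.
    placeholders = ", ".join("?" for _ in analytes)
    sql = (
        f"SELECT {', '.join(columns)} FROM results "
        f"WHERE analyte IN ({placeholders}) AND sample_date BETWEEN ? AND ? "
        "ORDER BY sample_date, location_id, analyte"
    )
    return conn.execute(sql, [*analytes, date_from, date_to]).fetchall()


def to_csv(columns, rows):
    """Render a header line plus the data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (location_id TEXT, sample_date TEXT, analyte TEXT, result REAL);
INSERT INTO results VALUES
    ('MW-01', '2017-04-17', 'Potassium', 12.5),
    ('MW-01', '2017-04-17', 'Sodium', 30.1),
    ('MW-02', '2018-01-09', 'Potassium', 9.8);
""")

columns = ["sample_date", "location_id", "result"]
rows = run_report(conn, columns, ["Potassium"], "2017-01-01", "2017-12-31")
report_csv = to_csv(columns, rows)
print(report_csv)
```

Checking column names against a whitelist and binding the filter values as parameters keeps user selections from being interpreted as SQL.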

          The list of reports spans all the major types of data stored in Locus EIM, including location and sample collection information, chain of custody and requested analyses data, analytical results, field measurements, and well and borehole data. Additional reports provide options to perform statistical calculations, trend analyses, and comparisons with regulatory and other limits.

          In short, when it comes to generating reports, databases are superior to spreadsheets in almost every aspect. However, that doesn’t mean spreadsheets have no role to play. Many Locus EIM users charged with creating an ad hoc report prefer to download their selected output to Excel, where they apply final formatting and add a title and footer. With some of the newer reporting tools, such as Locus EIM’s new enhanced formatted reports, that functionality is also built into the DBMS. The more sophisticated the database, the more advanced and robust its reporting options will be.



           
