As I speak with people from various technical backgrounds, I find there is lot of confusion about how to structure data. For a full background on this and related topics I rely upon the work of Fabian Pascal, C.J. Date, and C.F. Codd, vibrantly maintained on Pascal's site, dbdebunk.com. I encourage anyone who favors "learning" over "training" to spend some time there.
When I address a new challenge of database development or workflow design, I start by distinguishing between Transaction data and Reference data. Transaction data are those that accumulate over time, but do not change frequently. There are a lot of INSERTs, but not many UPDATEs. Orders, invoices, calls, observations, and events are all types of Transaction data. Transaction data may include attributes that contain Reference data.
Reference data are those that remain mostly static over time, but once established, change infrequently. Examples include addresses, phone numbers, product lists, clients, vendors, etc. Reference data grows at a much slower pace than Transaction data, so much of the maintenance involves de-duplication and consistency of form. For instance, address data can be better de-duplicated if at first you make all the street types consistent: i.e. "Avenue" and "Ave" and "Av" are all made to be the same "Ave". If you can do this BEFORE inserting it into a database, that saves A LOT of time and resources.
By contrast, Transaction data maintenance is more focused on removing data that are no longer needed. The most efficient way to manage Transaction data in a database is to 1) not load attributes that will never be used, and 2) partition data based on attributes that reflect its useful life. This makes it easier to remove it without scanning all the data you need to keep.
So the next time you are thinking about loading up your linked lists, name-value pairs, and dictionaries into a database, remember that you only need to save what you will use. Kind of like your refrigerator. If you are not going to eat it, you don't need to wrap it up or freeze it. Collect only what you need. Determining what you "need" is of course a prerequisite that we can help you to understand.
Early in the history of information technology, multi-user transactional systems introduced the idea of “record locking” to resolve the problem of two humans attempting to edit the same record at the same time. A lot of this had to do with the fact that systems at the time could work with only full records, and not isolate updates to individual fields of a record. The fields were not delimited into separate columns, but rather stored as a continuous string of data across the record. Each field had a fixed width, so reading any particular field required counting characters across the record to find the starting and ending bytes of a field. Unused space was held in reserve for the longest allowable data in a field.
With relational databases, the idea of variable length string data and special purpose data types like datetime, integer, floating point, and eventually, for accurate financial calculations, packed-decimal, numeric, or just “decimal” data types were created. This enabled more efficient use of disk space. The SQL language to access these records included a column-specific syntax to select, insert, or update individual columns of a particular record or records.
Since the entire record had to be read and written as a unit, it made sense to have record locking so that one user changing an employee's address did not step on another user changing that employee's salary. The second user would get an error message that the record was being locked by another user and wait before attempting to enter the salary data again. Once the first user was finished updating the address and the record lock was released, the second user could then update the salary. Since the entire record was read and written as a whole, the data integrity was maintained.
Modern relational database systems use the SQL language which can enable individual columns to be updated, so multiple updates to the same record may be queued into the transaction log before committing to the database. If the application software allows security to be implemented at the column and record level, then the last entry is the final entry. The column-level security prevents users with DISTINCT write access from overwriting one another's data. The payroll user can update salary, but not address. The human resources user can update address, but not salary. Then both users can simultaneous edit the same employee record without conflict. With the increasing speed of computers, record locking became less of a concern than the management of user's tasks, i.e. the division of labor.
If on the other hand, two users are given the same task to update the employee's address, but are given two different addresses, nothing about record locking will identify the “correct” address. If the first user updates to address #1 before of after the second user updates to address #2, then the last entry becomes the final entry. If it was wrong, there would be no indication of how that happened. The user with the “correct” address would simply re-enter the address, to make it right.
Transaction Editor ™ logs who and when changes are made to every record and keeps in a “history” table an exact copy of the record as it existed before the change. This allows management to identify mistakes made during the allocation of user's tasks. It also prevents malicious users from causing permanent loss of information and allows users to recover mistakes, including accidental (or intentional) record deletions, without engaging the services of a database administrator or initiating a disaster recovery plan.
I started my information technology career after earning a degree in Economics in 1982, primarily by earning A's in Econometrics. I just found statistics to be simple to understand and focused my curriculum on a variety of Liberal Arts courses. Near the end I had more Economics credits than any other college, so a B.A. in Economics set me free. The Personal Computer revolution was just beginning and the cost of computing was becoming affordable for small businesses. An Apple II with a C/PM card would get you VisiCalc, WordStar, and dBase II which could easily justify the expense.
My first gig as a freelance programmer preceeded the Internet and amounted to teaching myself Applesoft Basic so I could deliver a Quality Control application to Northridge Hospital in southern California. The Surgical Unit nurses who reviewed the medical records wanted to show physicians the rate of failure (death, relapse) grouped by surgical procedure used, and of course, by physician. So I started to read, and read, and read, about data normalization and relational concepts. E.F. Codd's work had been continued by C.J. Date in several volumes of papers collected in his published works. My comfort with reading academic journals was honed by my Econometrics professors. Currently this important work is continued by Fabian Pascal, whose website is my all time favorite: dbdebunk.com.
Soon after that I took a regular job for the leading marketing research firm in the exploding computer services business, INPUT. At that time, INPUT was the Gartner Group of its day. All large and most small computer services firms subscribed to INPUT's regular reports on the future of particular markets and technologies. I was a Research Analyst, but soon became proficient in advanced forecasting spreadsheets and sifting through the media hype for indicators of future success. In 1984, I authored an INPUT report on the commercial potential of "Artificial Intelligence," which at the time was mostly experimental--except for those seeking venture funding.
Soon I began to "Awaken", as Henri Bergson puts it, to my "spiritual immune system." My "journey is made by ... choosing knowledge over the veils of ignorance." Like religion, beliefs are learned and earned in software technology through intense study. Unfortunately, most public commentary about data normalization makes no sense, have no academic merit, and are vigorously defended by those graduate students trying to free themselves into the world of consulting. Fabian Pascal nails these commentaries to his wall every week and responds with robust defensible academic criticism. I thank him and C.J. Date for saving me from wasting any part of my life learning stupid or regressive technologies that regularly come into and out of favor. For instance, the "relational" vs. "object" debate finally came to reconciliation with "The Third Manifesto." But not many freelance web programmers know that.
I eventually left INPUT and started a freelance web programmer practice developing and selling application development and database development services for reinforcing steel fabricators. That market was short-lived, so back to a job I went. Having spent several years in database development for, among others, Travelers, GE Capital, and Hewlett-Packard, I came to the realization that by separating the business logic from the presentation logic I could improve code reuse. So I created a system that uses database metadata to drive the presentation for any database table structure and dynamically creates user interface controls for displaying related data. Soon thereafter, HTML became the preferred presentation layer descriptive language for application development, and Transaction Editor was born.
The owner's of small brick and mortar businesses who use Transaction Editor for their unique database development remind me every month how glad they are to have met me. For them, I solve the application development backlog permanently. They don't have to worry about losing the resource who wrote their application, finding another developer able to understand the language/framework/components and willing to study the business logic encapsulated in their application. I am devoted to serving small businesses by giving them enterprise-class functionality at an affordable price. My challenge to growth is to rise above the crowd of freelance web programmers and have a long enough conversation with potential clients to demonstrate my capabilities with the Transaction Editor service model.
I have used Guru.com for years and more recently added Freelancer.com. I DON'T USE Upwork.com (formerly Elance.com and Odesk.com) because they place too many restrictions on freelancers, which are a dis-service to clients. But browsing projects on these sites is a time consuming task mostly for their lacking search capabilities. So in searching again for other channels to reach small businesses, I came across the latest entrant into the world of recruiter disintermediation, Toptal Software programmers network. I've had a long career of recruiter-led on-site engagements. But Toptal and recruiters in general, like their Human Resources counterparts, are focused on finding experience in particular technologies. Again, failing to service clients with advised solutions.
The types of small businesses that Transaction Editor helps do not use recruiters. They may ask their accountant or desktop support person for a freelance web programmer referral, but their accountants are not typically versed in application development enough to provide a recommendation. I have had some success networking with desktop support and network engineering firms. Younger small businesses seem comfortable searching the Internet, only to be inundated with questions about language and platform and framework, choices that they could not care less about. Oftentimes I use Transaction Editor to replace a power-user home-grown spreadsheet or MS Access database application that has grown beyond their capabilities to support. On occasion, they will have a 20-year-old custom application developed in some long-ago lost technology whose support has been sunset.
Because the people who need my services are unique businesses with no technical staff looking for a solution to a business process problem, they may not find me with a limited vocabulary of languages, platforms, and frameworks they do not understand. The service I offer DOES eliminate the application backlog, is readily adopted by most brick and mortar businesses, and continues to provide me with multiple micro-streams of income. Save them from the pain of re-writing their applications every 3-5 years by recommending me as their freelance web programmer. They and I will be glad you did.
Commercial off-the-shelf (COTS) software provide functional capabilities commonly shared across multiple users. The ability to fit large groups of users into a single class having common needs and objectives benefits that class through the economies of scale that COTS offers. Word processing, spreadsheet, and presentation applications are generalized COTS software suitable for a large audience of office users. This allows significant economies of scale if the user base is large.
Business transaction management becomes less common the further you go up the transaction stream from general ledger (GL) and financial reporting needs. Generally accepted accounting principals (GAAP) allow many businesses to share the same COTS software. When it comes to Accounts Payable (AP) and Accounts Receivable (AR) accounting, things start to get more complicated. A financial services firm will need to handle Receivables differently than a manufacturer, and may have significantly fewer functional requirements for Payables.
To feed AR transactions, Order Processing capabilities needed by a manufacturer of lightbulbs (discrete manufacturing) is very different from the needs of a manufacturer of food additives (process manufacturing). Among discrete manufacturers, the transactions leading to the formation of AP transactions may require a more complex handling of the Bill of Materials (BOM). A computer has many more parts than a lightbulb and may involve a BOM of components within a BOM for finished goods.
The Bill of Lading (BOL) a concrete reinforcing steel fabricator needs must include weights of packaging that is uniquely determined by the lift capacity of cranes. While the BOL of a process manufacturer of food additives or hazardous chemicals must include product component details to comply with safety regulations.
If your business competes on having trade secrets in process controls and methods, the last thing you want is software that shares these secrets with your competitors, which is what COTS is all about.
By the time you are reading this, you have already determined that you need software to solve some businesss problem. You have already considered the features you need. You may have even made some forms to collect the data you need. You just need to determine if it is more cost-effective to buy something off-the-shelf or to build it from your requirements. You need to consider four things to make this decision.
Application updates and technical support require some sort of access to the computer systems
hosting the application.
A) What is your existing information access/security strategy? Does it require a badged individual with Top Security clearance to be on site? Or can an outside vendor connect over an internet connection into your user’s workstations to provide end-user support?
B) Are data integrations with other systems required? What are the access contraints of those systems?
C) Does your organization have platform, language, or framework standards that constrain your choices?
No matter the size, most businesses have an existing system or two within the organization serving some purpose. Your may consider that with some modification, these systems may satisfy your need cost-effectively. Existing systems often cannot be modified because of insufficient documentation or support. It cannot be modified if no one understands it, has the source code, or exists to support it. Or it may just be too costly to modify it. This widens the scope of your requirements to include replacing the existing capabilities.
Even if your requirements are outside the scope modifying of existing software, you may have to integrate new software with existing systems to provide transactional data. This adds cost to your project for new feature development and depends heavily on your organization’s other standards and constraints. These constraints can also limit the availability of human resources with the required skills to get your project done.
COTS software rarely satisfies all of a business’s requirements. Oftentimes, COTS software is designed to satisfy the majority of requirements for all users in a certain class. Some COTS software will allow simple modifications like adding a few custom data fields that can be manipulated in the application’s reports. But if reporting is incomplete without integrating the data from the new COTS system with other applications, then this integration will still be necessary. Such customization of COTS software can make updates and patches extremely burdensome to the organization.
Evaluate the cost of customizing, integrating, and supporting COTS software. Evaluate the cost of writing your software solution in-house or hiring a professional services vendor. These comparisons involve both hard and soft costs. Hard costs are what you pay to get the features that can be delivered. Soft costs are what it costs you in time to use and support the system. Soft costs also include the marginal cost of NOT getting any features one way or the other. It is in the negotiation of these soft marginal costs that great decisions are made.
Scope creep has killed many projects due to cost overruns. When users start to use a system, they find a way to communicate additional requirements. Well into the initial phases of a software development project, a good programmer analyst will discover these requirements and document change orders for the marginal cost of adding these features. The marginal benefits can be insignificant. For instance, if automating a process to move data to a file on a network saves a user four hours of manual work to complete the same task once per month, does this justify the cost of development of automating that process? Not likely. If it saves them one hour a day, then it is more likely beneficial enough to automate it.
Just like COTS, Custom Developed software also has continuing support costs for patches and updates. The sooner you can get users using a Custom Developed solution, the quicker you will discover these needs. You will want to minimize your financial exposure to getting users on the system early.
You should choose a COTS solution IF:
A) It complies with your organization’s constraints.
B) It meets most of your functional requirements for the next 3-5 years with little or no customization.
C) It does not have more features than you will use, but for which you must still pay.
D) The support costs do not exceed your operating budget.
You should choose a Custom Developed solution IF:
A) No COTS software adequately satisfies the business requirements without material modifications.
B) Support costs are within your operating budget.
Here is a plea from a small business owner who wants to own their own software:
"I had a programmer, who is no longer in the picture, develop software for me. He
placed the program on his server. I can use the program but, I cannot register others
to use it, etc. I need someone to put the program on MY server where I have control."
OK. So you own the intellectual property rights to some code that helps your business succeed. But what if your business needs change? You still need someone to take responsibility for understanding that code, have the skills to modify that code, and be interested and compensated sufficiently to continue to support that code. If you own it, now you are in the software business. You need to keep an employee or a freelance web programmer interested in your business application long enough to survive the next operating system or framework upgrade, business process change, or hardware upgrade.
What if the framework vendor decides it no longer wants to support that framework? Vendor acquisitions, product sunsets, or management changes usually result in someone (or a lot of someones) rewriting the same code over again every 3-5 years. So now your precious intellectual property is made obsolete by forces beyond your control, those external forces operating behind the scenes that only seasoned IT executives understand all too well.
But you just want to run your business as efficiently as possible. A software package made for your industry seems appropriate. That is, until that vendor gets gobbled up by a competitor or makes upgrades to features you don't use. Again, you are back to rebuilding into code your unique business process--your trade secrets--that make your business run efficiently.
Standards. Yes, the answer is standards. But what is the standard today? When and how will that standard change over time, and will that affect your application development? This is a moving target that is at the bleeding edge of the software field. For business applications that process transactions, does this really need to change as often as software vendors churn out new features? Why should your IT support persons change more often than your accountant?
Business computer systems serve two purposes. They keep track of Things and they keep track of Events (i.e. transactions). For example: "Sold 34 Widgets on Thursday". "Widgets"=Thing. "Sold 34 on Thursday"=Event. Not a lot has changed in 40 years with respect to the technology needed to handle this. HTML behaves like CICS. Relational databases replaced ISAM. The software industry's penchant for "new and improved" is why spreadsheets run half the world. But I digress.
If you focus on the SERVICES related to keeping order among your chaos, you will likely succeed with less technology pain. Like having a good CPA keep track of your books or an ATTORNEY keep track of your contracts, having a good professional keep track of your software development will likely make you more successful, measured in your preference of either more profit or more free time.
So what is this type of "professional" called? There isn't one name for such a person. There are no well known licensing boards certifying information technology professionals. In fact, if you asked a hiring manager (or professional recruiter for that matter) what to look for in a freelance web programmer, they would list a litany of products, technologies, and programming languages. If carpenters were hired like programmers, the interview would go something like this. If you read any of the results from that link, you get my point.
Bottom line, technology matters, but only if you have the professionals who can sift through the pile and find your pony in there.
The Holy Grail of application development is a tool that empowers the individual to "build it themselves." Spreadsheets, Wordpress, and SharePoint seem to lead in that space. There is no shortage of application development tools. You can search the Internet for application development tools only to discover that most solutions are so focused and simplistic, they hardly do anything. You can spend six months to learn how to build your application, only to spend the next six months upgrading it for the next release, until the vendor "sunsets" your platform to make way for the Next Big Thing.
To make software useful requires people who understand your business. Unfortunately, staffing or finding a freelance web programmer who knows the particular language, libraries, or tools that your application is built upon is no small task. This is made even more challenging by the ever-improving offers by old and new vendors. Technical recruiters and HR people struggle with defining the skills needed because there are literally hundreds of ways to build a business software application. There really are no "standards".
What businesses are lacking is standards, real standards. Imagine what software industry "standards" would read like if they were applied to the construction industry.
Pipe Fittings. Available in a variety of materials, you can connect your toilet to your water source by configuring your tank prefilter with any number of other Open Source materials. After extruding the material through your 3D printer and threading the connectors with just about any threading tool, you can form the outer casing to fit your team's own wrench set. For a fee, we can provide you with our wrench set.
Wall Components. We give you everything you need to build your own house. Solid Pine logs in standard lengths can be trimmed to any dimension using freely available Open Source band saws. They can easily be bound together with your choice of off-the-shelf components including rope, tar, steel shanks (headed or headless), and leather trimmings.
Electrical Wiring. Our professional installers can get you up and running in no time. We work in one metal (copper, aluminum, brass), use only certified Acme parts, and will give you copyright and patents on every bend and connection made throughout your house.
In my article "I Want To Own My Own Software -- Really?", I reference a mock interview dialog called "If Carpenters Were Hired Like Programmers" . Read any of the results from that search.