What's new in SPSS Statistics Developer 21.0.0.0

Jul 17, 2013
  • Simulation. Predictive models, such as linear regression, require a set of known inputs to predict an outcome or target value. In many real world applications, however, values of inputs are uncertain. Simulation allows you to account for uncertainty in the inputs to predictive models and evaluate the likelihood of various outcomes in the presence of that uncertainty. See the topic Simulation for more information.
  • One-click descriptive statistics. Select variables in the Data Editor and get summary descriptive statistics (for example, mean, median, frequency counts). Appropriate statistics are automatically determined based on measurement level. See the topic Obtaining Descriptive Statistics for Selected Variables for more information.
  • Read Cognos Business Intelligence data. If you have access to an IBM® Cognos® Business Intelligence server, you can read data packages and list reports into IBM SPSS Statistics. See the topic Reading Cognos data for more information.
  • Merge data files without pre-sorting. Merge data files by values of key variables without pre-sorting the files based on key values. You can also merge data files based on string keys of different defined lengths in each file and merge a case data file with multiple table-lookup files with different keys in each table-lookup file. See the topic STAR JOIN for more information.
  • Compare datasets. Compare the data values and metadata attributes (dictionary information) of two datasets. See the topic Comparing datasets for more information.
  • Password protect and encrypt data and output files. See the topic Encrypting data files and output documents for more information.
  • Pivot table editing enhancements. After creating pivot tables, you can now:
      • Toggle the display of names, values, and labels. See the topic Controlling display of variable and value labels for more information.
      • Sort table rows. See the topic Sorting rows for more information.
      • Insert rows and columns. See the topic Inserting rows and columns for more information.
      • Change the output language. See the topic Changing the output language for more information.
  • Export output in Excel 2007 and higher format. See the topic Export output for more information.
  • Preserve table styles when exporting output to HTML. All pivot table style information (for example, font styles, background colors) and column widths can now be preserved. See the topic HTML options for more information.
  • Unicode default. SPSS Statistics now runs in Unicode mode by default instead of code page mode.
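
As an illustration of the Simulation item above, the following is a minimal Python sketch of the general idea (Monte Carlo simulation of uncertain inputs to a predictive model). It is not SPSS syntax and does not reproduce the product's implementation; the model coefficients, the input distributions, and the names predict_sales and simulate are all hypothetical.

```python
import random
import statistics


def predict_sales(price, ad_spend):
    # Hypothetical fitted linear model; the coefficients are invented for illustration.
    return 500.0 - 20.0 * price + 3.0 * ad_spend


def simulate(n_draws=10_000, seed=1):
    """Draw uncertain inputs from assumed distributions and score each draw."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_draws):
        price = rng.gauss(10.0, 1.0)        # assumed: normally distributed input
        ad_spend = rng.uniform(40.0, 60.0)  # assumed: uniformly distributed input
        outcomes.append(predict_sales(price, ad_spend))
    return outcomes


outcomes = simulate()
mean_outcome = statistics.fmean(outcomes)
# Estimated likelihood that the predicted target exceeds a chosen threshold.
prob_above_target = sum(o > 460.0 for o in outcomes) / len(outcomes)
print(f"mean outcome: {mean_outcome:.1f}")
print(f"P(outcome > 460): {prob_above_target:.3f}")
```

The distribution of simulated outcomes, rather than a single predicted value, is what lets you judge how likely each outcome is when the inputs are uncertain.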

New in SPSS Statistics Developer 20.0.0.0 (Jul 17, 2013)

  • Maps. The Graphboard Template Chooser now includes templates for creating different types of map visualizations, such as choropleth maps (color maps), maps with mini-charts, and overlay maps. IBM® SPSS® Statistics ships with several map files, but you can use the Map Conversion Utility to convert your existing map shapefiles for use with the Graphboard Template Chooser. See the topic Using the Map Conversion Utility for more information.
  • Faster rendering of pivot tables. Pivot tables now render much faster than in previous versions, while retaining full support for pivoting and editing. If you used fast rendering of lightweight tables in version 19, you will find comparable results for pivot tables in version 20 and higher, without the limitations of lightweight tables. Users who require compatibility with versions prior to 20 can choose to generate legacy tables (referred to as full-featured tables in version 19). See the topic Pivot table options for more information.
  • Background, disconnected execution for production jobs. Production jobs can be run in a separate background session on a remote server. You can submit the jobs from your local computer, disconnect from the remote server, reconnect later, and retrieve your results. You don't need to keep SPSS Statistics running on your local computer. You don't even need to keep your local computer turned on. Progress of remote jobs can be monitored and results retrieved from the new Background Job Status tab of the production facility dialog. See the topic Production jobs for more information.
  • Ordinal Targets for Generalized linear mixed models. The Generalized linear mixed models procedure now uses the information in the ordering of categories of targets with the ordinal measurement level. Ordinal targets are modeled with an ordinal multinomial distribution, and the target is linearly related to the factors and covariates via one of a number of cumulative link functions. This feature is available in the Advanced Statistics add-on option.
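
To make the idea of a cumulative link function more concrete, here is a small Python sketch using the cumulative logit link, one common choice of cumulative link. It is an illustration only, not the procedure's implementation: the threshold values, the linear predictor eta, and the sign convention P(Y <= j) = logistic(threshold_j - eta) are assumptions made for this example.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(eta, thresholds):
    """Return probabilities for len(thresholds) + 1 ordered categories.

    eta is the linear predictor (factors and covariates combined);
    thresholds must be in ascending order.
    """
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


probs = category_probabilities(eta=0.5, thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

Under this convention, a larger linear predictor shifts probability toward the higher categories, which is how the ordering of the categories enters the model.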

New in SPSS Statistics Developer 19.0.0.0 (Jul 17, 2013)

  • Linear models. Linear models predict a continuous target based on linear relationships between the target and one or more predictors. Linear models are relatively simple and give an easily interpreted mathematical formula for scoring. The properties of these models are well understood, and the models can typically be built very quickly compared to other model types (such as neural networks or decision trees) on the same dataset. This feature is available in the Statistics Base add-on module. See the topic Linear models for more information.
  • Generalized linear mixed models. Generalized linear mixed models extend the linear model so that: the target is linearly related to the factors and covariates via a specified link function; the target can have a non-normal distribution; and the observations can be correlated. Generalized linear mixed models cover a wide variety of models, from simple linear regression to complex multilevel models for non-normal longitudinal data. This feature is available in the Advanced Statistics add-on module. See the topic Generalized linear mixed models for more information.
  • Lightweight tables. Lightweight tables can be rendered much faster than full-featured pivot tables. Although they lack the editing features of pivot tables, they can easily be converted to pivot tables with all editing features enabled. See the topic Pivot table options for more information.
  • Scoring wizard. The new scoring wizard makes it easy to apply predictive models to score your data, and scoring no longer requires IBM® SPSS® Statistics Server. See the topic Scoring data with predictive models for more information.
  • Improved default measurement level. For data read from external sources and new variables created in a session, the method for determining default measurement level has been improved to evaluate more conditions than just the number of unique values. Since measurement level affects the results of many procedures, correct measurement level assignment is often important. See the topic Data Options for more information.
  • "Smart" output. The procedures in the Direct Marketing add-on module now provide "smart" output: simple, non-technical explanations that help you evaluate your results.
  • Syntax editor enhancements. You can now split the editor pane into two panes arranged with one above the other. You can indent or outdent blocks of syntax or automatically indent selections with a format similar to pasted syntax. A new toolbar button allows you to uncomment text that was previously commented out, and a new option setting allows you to paste syntax at the position of the cursor. You can now also navigate to the next or previous syntactical error (such as an unmatched quote), making it easier to locate these errors before running the syntax. See the topic Using the Syntax Editor for more information.
  • Database drivers for salesforce.com. Database drivers for salesforce.com allow analysts to access data in salesforce.com just as they would access data in a SQL database. Analysts can now connect to salesforce.com, extract the relevant data, and perform analysis.
  • Compiled transformations. When you use compiled transformations, transformation commands (such as COMPUTE and RECODE) are compiled to machine code at run time to improve the performance of these transformations for datasets with a large number of cases. This feature requires SPSS Statistics Server.
  • Statistics portal. Statistics portal is a Web-based interface for IBM SPSS Collaboration and Deployment Services users that allows them to analyze their data with the power of the SPSS Statistics engine. They run analyses from custom user interfaces authored in SPSS Statistics (with the Custom Dialog Builder) and stored in their IBM SPSS Collaboration and Deployment Services Repository. Enhancements relevant to authors of custom user interfaces for Statistics portal include: honoring a filter, specified for the active dataset, between successive analyses; hiding small counts in tables generated by CROSSTABS, OLAP CUBES, and CTABLES; and displaying a set of row and column dimensions as table layers in the CROSSTABS crosstabulation table.
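
The "easily interpreted mathematical formula for scoring" mentioned under Linear models above can be sketched as follows. This is a minimal Python illustration of ordinary least squares with a single predictor, not the procedure's implementation; the data values and the function names are hypothetical.

```python
def fit_simple_linear_model(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def score(intercept, slope, x):
    # Scoring a new case is just evaluating the fitted formula.
    return intercept + slope * x


# Invented data that lies exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = fit_simple_linear_model(xs, ys)
print(intercept, slope, score(intercept, slope, 6.0))
```

Because the fitted model is only an intercept and a set of coefficients, it can be applied to new cases without any special software.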

New in SPSS Statistics Developer 18.0.0.0 (Jul 17, 2013)

  • Automated data preparation. Automated Data Preparation (ADP) handles the task of preparing data for analysis, analyzing your data and identifying fixes, screening out fields (variables) that are problematic or not likely to be useful, deriving new attributes when appropriate, and improving performance through intelligent screening techniques. You can use the algorithm in fully automatic fashion, allowing it to choose and apply fixes, or you can use it in interactive fashion, previewing the changes before they are made and accepting or rejecting them as desired. Automated Data Preparation is available in the Data Preparation add-on option. See the topic Automated Data Preparation for more information.
  • Bootstrapping. Bootstrapping is a robust method for determining the properties of population estimators (like the mean, median, percentiles, and correlation and regression coefficients) when parametric assumptions do not hold, or when inferences based on parametric assumptions are difficult to compute. Bootstrapping is available in the new Bootstrapping add-on option. See the topic Introduction to Bootstrapping for more information.
  • New nonparametric tests. Nonparametric tests make minimal assumptions about the underlying distribution of the data. The new nonparametric tests provide a new user interface and Model Viewer output, and include all of the tests available in the legacy nonparametric tests, including: one-sample Wilcoxon signed-rank test, one-sample confidence intervals for the binomial distribution, the related-samples marginal homogeneity test, and the Hodges-Lehmann confidence interval for the median of the difference in paired-samples and the difference in medians of two independent samples. Pairwise and stepwise step-down multiple comparisons are also available for all k independent samples and k related samples tests. The Jonckheere-Terpstra test is available without requiring the Exact Tests add-on option. The new nonparametric tests are available in the Statistics Base add-on option. See the topic Nonparametric Tests for more information.
  • Programmability enhancements. The R Integration Plug-in now supports R debugging features. Additionally, you can create pivot tables from R with multiple row and column dimensions and you can nest multiple pivot tables under a common outline heading. R extension commands can be implemented directly from R source code files, bypassing the need to distribute them as R packages. Also, you can bundle together all components of a custom R or Python procedure, allowing end users to easily install the procedure without manually copying files. Complete documentation for the Python and R Integration Plug-ins is now integrated with the Help system.
  • Direct marketing tools. The new Direct Marketing add-on option provides a set of tools designed to improve the results of direct marketing campaigns by identifying demographic, purchasing, and other characteristics that define various groups of consumers and targeting specific groups to maximize positive response rates. See the topic Direct Marketing for more information.
  • Custom Tables enhancements. The Custom Tables add-on option now offers computed categories and significance results integrated into the same table as the values being tested. For more information on computed categories, see Computed Categories. For more information on significance tests in custom tables, see Custom Tables: Test Statistics Tab.
  • Improved SAS data file support. You can now write data files in SAS 9 format. See the topic Saving data: Data file types for more information.
  • Improved Custom Dialog Builder. The Custom Dialog Builder now has a list box control that supports single or multiple selection. Also, list items for combo box and list box controls can now be dynamically populated with values associated with the variables in a specified target list. In addition, radio buttons can now contain a set of nested controls. See the topic Creating and Managing Custom Dialogs for more information.
  • Improved display of large pivot tables. New display options are now available that make it easier to view and navigate large pivot tables (tables with hundreds or thousands of rows). See the topic Set rows to display for more information.
  • Improved TwoStep Cluster output. The TwoStep Cluster procedure now provides interactive model viewer output. TwoStep Cluster is available in the Statistics Base option. See the topic The Cluster Viewer for more information.
  • Additional rule-checking on quality control charts. Rule-checking is now performed on several additional control charts. When rule-checking is requested for an X-bar chart, it will also be performed on the accompanying R (range) or s (standard deviation) chart. Similarly, when rule-checking is requested for an Individuals (Runs) chart, it will also be performed on the accompanying Moving Range chart. Quality control charts are available in the Statistics Base option.
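
The Bootstrapping item above can be illustrated with a short Python sketch of a percentile bootstrap confidence interval for the median. This is a generic illustration of the resampling technique, not the add-on option's implementation; the sample values, the number of resamples, and the function name bootstrap_percentile_ci are assumptions made for the example.

```python
import random
import statistics


def bootstrap_percentile_ci(sample, statistic, n_resamples=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Invented data with one large value, where normal-theory intervals are less reliable.
sample = [12, 15, 9, 20, 14, 11, 30, 13, 16, 10, 18, 12]
low, high = bootstrap_percentile_ci(sample, statistics.median)
print(f"95% bootstrap CI for the median: [{low}, {high}]")
```

No distributional formula for the median is needed here; the interval comes entirely from the spread of the statistic across resamples.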

New in SPSS Statistics Developer 17.0.0.0 (Jul 17, 2013)

  • New syntax editor. The syntax editor has been completely redesigned with features such as auto-completion, color coding, bookmarks, and breakpoints. Auto-completion provides you with a list of valid command names, subcommands, and keywords, so you’ll spend less time referring to syntax charts. Color coding allows you to quickly spot unrecognized terms as well as some common syntactical errors. Bookmarks allow you to quickly navigate large command syntax files. Breakpoints allow you to stop execution at specified points so you can inspect data or output before proceeding. See the topic Using the Syntax Editor for more information.
  • Custom Dialog Builder. The Custom Dialog Builder allows you to create and manage custom dialogs for generating command syntax. You can create custom dialogs to generate syntax from multiple commands, including custom extension commands implemented in Python or R. See the topic Creating and Managing Custom Dialogs for more information.
  • Multiple language support. In addition to the ability to change the output language available in previous releases, you can now change the user interface language. See the topic General options for more information.
  • Codebook. The Codebook procedure reports the dictionary information (such as variable names, variable labels, value labels, and missing values) and summary statistics for all or specified variables and multiple response sets in the active dataset. For nominal and ordinal variables and multiple response sets, summary statistics include counts and percents. For scale variables, summary statistics include mean, standard deviation, and quartiles. See the topic Codebook for more information.
  • Nearest Neighbor analysis. Nearest Neighbor analysis is a method for classifying cases based on their similarity to other cases. In machine learning, it was developed as a way to recognize patterns of data without requiring an exact match to any stored patterns, or cases. Similar cases are near each other and dissimilar cases are distant from each other. Thus, the distance between two cases is a measure of their dissimilarity. See the topic Nearest Neighbor Analysis for more information.
  • Multiple Imputation. The Multiple Imputation procedure performs multiple imputation of missing data values. Given a dataset containing missing values, it outputs one or more datasets in which missing values are replaced with plausible estimates. You can then obtain pooled results when running other procedures. The procedure also summarizes missing values in the working dataset. This feature is available in the Missing Values add-on option. See the topic Impute Missing Data Values (Multiple Imputation) for more information.
  • RFM analysis. RFM (recency, frequency, monetary) analysis is a technique used to identify existing customers who are most likely to respond to a new offer. This technique is commonly used in direct marketing. This feature is available in the EZ RFM add-on option. See the topic RFM Analysis for more information.
  • Categorical Regression enhancements. Categorical Regression has been enhanced to include regularization and resampling methods to assess and improve prediction accuracy. Together, these new methods make it possible to create state-of-the-art models, even for high-dimensional data (where there are more variables than observations, such as in genomics). This feature is available in the Categories add-on option. See the topic Categorical Regression (CATREG) for more information.
  • Graphboard. Graphboard visualizations are graphs, charts, and plots created from a visualization template. IBM® SPSS® Statistics ships with built-in visualization templates. You can also use a separate product, IBM® SPSS® Visualization Designer, to create your own visualization templates. The new visualization templates are effectively custom visualization types. See the topic Creating and Editing Graphboard Visualizations for more information.
  • Exporting output. More output export format options and more control over exported content, including:
      • Wrap or shrink wide tables in Word documents. See the topic Word/RTF options for more information.
      • Create new worksheets or append data to existing worksheets in an Excel workbook. See the topic Excel options for more information.
      • Save output export specifications in the form of command syntax with the OUTPUT EXPORT command. All the features for exporting output in the Export Output dialog are now also available in command syntax, so you can save and re-run your export specifications and include them in automated production jobs. See the topic OUTPUT EXPORT for more information.
      • The Output Management System (OMS) now supports these additional output formats: Word, Excel, and PDF. See the topic Output Management System for more information.
  • Shift Values. Shift Values creates new variables that contain the values of existing variables from preceding (lag) or subsequent (lead) cases. See the topic Shift Values for more information.
  • Aggregate enhancements. You can now use the features of the Aggregate procedure without specifying a break variable. See the topic Aggregate Data for more information.
  • Median function. A median function is now available for computing the median value across selected variables for each case. See the topic Statistical functions for more information.
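
The Shift Values and Median function items above describe simple case-wise operations, sketched here in Python for illustration. This is not SPSS syntax. Using None to stand for a missing value, skipping missing values when computing the median, and the function names are all assumptions made for this example.

```python
import statistics


def shift_values(values, offset):
    """Return values shifted across cases.

    offset > 0 takes the value from a preceding case (lag);
    offset < 0 takes the value from a subsequent case (lead).
    Cases with no source value get None.
    """
    n = len(values)
    shifted = []
    for i in range(n):
        j = i - offset
        shifted.append(values[j] if 0 <= j < n else None)
    return shifted


def case_median(*values):
    """Median across the selected variables for a single case."""
    present = [v for v in values if v is not None]
    return statistics.median(present) if present else None


sales = [10, 12, 9, 15]
lag1 = shift_values(sales, 1)    # [None, 10, 12, 9]
lead1 = shift_values(sales, -1)  # [12, 9, 15, None]
print(lag1, lead1)
print(case_median(3, None, 7, 5))  # 5
```

Note that the median here is computed across variables within one case, not down a column of cases.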