Apache Hive Changelog

New in Apache Hive 1.2.0 (May 19, 2015)

  • Sub-task:
  • [HIVE-8119] - Implement Date in ParquetSerde
  • [HIVE-8164] - Adding in a ReplicationTask that converts a Notification Event to actionable tasks
  • [HIVE-8165] - Annotation changes for replication
  • [HIVE-8379] - NanoTimeUtils performs some work needlessly
  • [HIVE-8696] - HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
  • [HIVE-8817] - Create unit test where we insert into an encrypted table and then read from it with pig
  • [HIVE-8818] - Create unit test where we insert into an encrypted table and then read from it with hcatalog mapreduce
  • [HIVE-9009] - order by (limit) meaning for the last subquery of union in Hive is different from other mainstream RDBMS
  • [HIVE-9253] - MetaStore server should support timeout for long running requests
  • [HIVE-9271] - Add ability for client to request metastore to fire an event
  • [HIVE-9273] - Add option to fire metastore event on insert
  • [HIVE-9327] - CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
  • [HIVE-9333] - Move parquet serialize implementation to DataWritableWriter to improve write speeds
  • [HIVE-9432] - CBO (Calcite Return Path): Removing QB from ParseContext
  • [HIVE-9501] - DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
  • [HIVE-9508] - MetaStore client socket connection should have a lifetime
  • [HIVE-9550] - ObjectStore.getNextNotification() can return events inside NotificationEventResponse as null which conflicts with its thrift "required" tag
  • [HIVE-9558] - [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode
  • [HIVE-9563] - CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]
  • [HIVE-9571] - CBO (Calcite Return Path): Generate FileSink Op [CBO branch]
  • [HIVE-9582] - HCatalog should use IMetaStoreClient interface
  • [HIVE-9585] - AlterPartitionMessage should return getKeyValues instead of getValues
  • [HIVE-9657] - Use new parquet Types API builder to construct data types
  • [HIVE-9666] - Improve some qtests
  • [HIVE-9690] - Refactoring for non-numeric arithmetic operations
  • [HIVE-9750] - avoid log locks in operators
  • [HIVE-9792] - Support interval type in expressions/predicates
  • [HIVE-9810] - prep object registry for multi threading
  • [HIVE-9819] - Add timeout check inside the HMS server
  • [HIVE-9824] - LLAP: Native Vectorization of Map Join
  • [HIVE-9894] - Use new parquet Types API builder to construct DATE data type
  • [HIVE-9906] - Add timeout mechanism in RawStoreProxy
  • [HIVE-9937] - LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
  • [HIVE-9982] - CBO (Calcite Return Path): Prune TS Relnode schema
  • [HIVE-9998] - Vectorization support for interval types
  • [HIVE-10037] - JDBC support for interval expressions
  • [HIVE-10044] - Allow interval params for year/month/day/hour/minute/second functions
  • [HIVE-10053] - Override new init API from ReadSupport instead of the deprecated one
  • [HIVE-10071] - CBO (Calcite Return Path): Join to MultiJoin rule
  • [HIVE-10076] - Bump up parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6
  • [HIVE-10131] - LLAP: BytesBytesMultiHashMap and mapjoin container should reuse refs
  • [HIVE-10227] - Concrete implementation of Export/Import based ReplicationTaskFactory
  • [HIVE-10228] - Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
  • [HIVE-10243] - CBO (Calcite Return Path): Introduce JoinAlgorithm Interface
  • [HIVE-10252] - Make PPD work for Parquet in row group level
  • [HIVE-10262] - CBO (Calcite Return Path): Temporarily disable Aggregate check input for bucketing
  • [HIVE-10263] - CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional
  • [HIVE-10326] - CBO (Calcite Return Path): Invoke Hive's Cumulative Cost
  • [HIVE-10329] - Hadoop reflectionutils has issues
  • [HIVE-10343] - CBO (Calcite Return Path): Parameterize algorithm cost model
  • [HIVE-10347] - Merge spark to trunk 4/15/2015
  • [HIVE-10350] - CBO: Use total size instead of bucket count to determine number of splits & parallelism
  • [HIVE-10369] - CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled
  • [HIVE-10375] - CBO (Calcite Return Path): disable the identity project remover for some union operators
  • [HIVE-10386] - CBO (Calcite Return Path): Disable Trivial Project Removal on ret path
  • [HIVE-10391] - CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column
  • [HIVE-10400] - CBO (Calcite Return Path): Exception when column name contains dot or colon characters
  • [HIVE-10413] - [CBO] Return path assumes distinct column can't be same as grouping column
  • [HIVE-10416] - CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite
  • [HIVE-10426] - Rework/simplify ReplicationTaskFactory instantiation
  • [HIVE-10455] - CBO (Calcite Return Path): Different data types at Reducer before JoinOp
  • [HIVE-10462] - CBO (Calcite Return Path): MapJoin and SMBJoin conversion not triggered
  • [HIVE-10493] - Merge multiple joins when join keys are the same
  • [HIVE-10506] - CBO (Calcite Return Path): Disallow return path to be enabled if CBO is off
  • [HIVE-10512] - CBO (Calcite Return Path): SMBJoin conversion throws ClassCastException
  • [HIVE-10520] - LLAP: Must reset small table result columns for Native Vectorization of Map Join
  • [HIVE-10522] - CBO (Calcite Return Path): fix the wrong needed column names when TS is created
  • [HIVE-10526] - CBO (Calcite Return Path): HiveCost epsilon comparison should take row count into account
  • [HIVE-10547] - CBO (Calcite Return Path) : genFileSinkPlan uses wrong partition col to create FS
  • [HIVE-10549] - CBO (Calcite Return Path): Enable NonBlockingOpDeDupProc
  • Bug:
  • [HIVE-3454] - Problem with CAST(BIGINT as TIMESTAMP)
  • [HIVE-4625] - HS2 should not attempt to get delegation token from metastore if using embedded metastore
  • [HIVE-5545] - HCatRecord getInteger method returns String when used on Partition columns of type INT
  • [HIVE-5672] - Insert with custom separator not supported for non-local directory
  • [HIVE-6069] - Improve error message in GenericUDFRound
  • [HIVE-6099] - Multi insert does not work properly with distinct count
  • [HIVE-6950] - Parsing Error in GROUPING SETS
  • [HIVE-7351] - ANALYZE TABLE statement fails on postgres metastore
  • [HIVE-7641] - INSERT ... SELECT with no source table leads to NPE
  • [HIVE-8524] - When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS
  • [HIVE-8626] - Extend HDFS super-user checks to dropPartitions
  • [HIVE-8746] - ORC timestamp columns are sensitive to daylight savings time
  • [HIVE-8890] - HiveServer2 dynamic service discovery: use persistent ephemeral nodes curator recipe
  • [HIVE-8915] - Log file explosion due to non-existence of COMPACTION_QUEUE table
  • [HIVE-9002] - union all does not generate correct result for order by and limit
  • [HIVE-9023] - HiveHistoryImpl relies on removed counters to print num rows
  • [HIVE-9073] - NPE when using custom windowing UDAFs
  • [HIVE-9083] - New metastore API to support to purge partition-data directly in dropPartitions().
  • [HIVE-9086] - Add language support to PURGE data while dropping partitions.
  • [HIVE-9115] - Hive build failure on hadoop-2.7 due to HADOOP-11356
  • [HIVE-9118] - Support auto-purge for tables, when dropping tables/partitions.
  • [HIVE-9151] - Checking s against null in TezJobMonitor#getNameWithProgress() should be done earlier
  • [HIVE-9228] - Problem with subquery using windowing functions
  • [HIVE-9303] - Parquet files are written with incorrect definition levels
  • [HIVE-9322] - Make null-checks consistent for MapObjectInspector subclasses.
  • [HIVE-9350] - Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'
  • [HIVE-9397] - SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
  • [HIVE-9430] - NullPointerException on ALTER TABLE ADD PARTITION if no value given
  • [HIVE-9438] - The standalone-jdbc jar missing some jars
  • [HIVE-9456] - Make Hive support unicode with MSSQL as Metastore backend
  • [HIVE-9468] - Test groupby3_map_skew.q fails due to decimal precision difference
  • [HIVE-9471] - Bad seek in uncompressed ORC, at row-group boundary.
  • [HIVE-9472] - Implement 7 simple UDFs added to Hive
  • [HIVE-9474] - truncate table changes permissions on the target
  • [HIVE-9481] - allow column list specification in INSERT statement
  • [HIVE-9482] - Hive parquet timestamp compatibility
  • [HIVE-9484] - ThriftCLIService#getDelegationToken does case sensitive comparison
  • [HIVE-9486] - Use session classloader instead of application loader
  • [HIVE-9489] - add javadoc for UDFType annotation
  • [HIVE-9496] - Slf4j warning in hive command
  • [HIVE-9507] - Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
  • [HIVE-9509] - Restore partition spec validation removed by HIVE-9445
  • [HIVE-9512] - HIVE-9327 causing regression in stats annotation
  • [HIVE-9513] - NULL POINTER EXCEPTION
  • [HIVE-9526] - ClassCastException thrown by HiveStatement
  • [HIVE-9529] - "alter table .. concatenate" under Tez mode should create TezTask
  • [HIVE-9539] - Wrong check of version format in TestWebHCatE2e.getHiveVersion()
  • [HIVE-9553] - Fix log-line in Partition Pruner
  • [HIVE-9555] - assorted ORC refactorings for LLAP on trunk
  • [HIVE-9560] - When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will result in value '0' after running 'analyze table TABLE_NAME compute statistics;'
  • [HIVE-9565] - Minor cleanup in TestMetastoreExpr.
  • [HIVE-9567] - JSON SerDe not escaping special chars when writing char/varchar data
  • [HIVE-9580] - Server returns incorrect result from JOIN ON VARCHAR columns
  • [HIVE-9587] - UDF decode should accept STRING_GROUP types for the second parameter
  • [HIVE-9588] - Reimplement HCatClientHMSImpl.dropPartitions() with HMSC.dropPartitions()
  • [HIVE-9592] - fix ArrayIndexOutOfBoundsException in date_add and date_sub initialize
  • [HIVE-9609] - AddPartitionMessage.getPartitions() can return null
  • [HIVE-9612] - Turn off DEBUG logging for Lazy Objects for tests
  • [HIVE-9613] - Left join query plan outputs wrong column when using subquery
  • [HIVE-9617] - UDF from_utc_timestamp throws NPE if the second argument is null
  • [HIVE-9619] - Uninitialized read of numBitVectors in NumDistinctValueEstimator
  • [HIVE-9620] - Cannot retrieve column statistics using HMS API if column name contains uppercase characters
  • [HIVE-9622] - Getting NPE when trying to restart HS2 when metastore is configured to use org.apache.hadoop.hive.thrift.DBTokenStore
  • [HIVE-9623] - NullPointerException in MapJoinOperator.processOp(MapJoinOperator.java:253) for TPC-DS Q75 against un-partitioned schema
  • [HIVE-9624] - NullPointerException in MapJoinOperator.processOp(MapJoinOperator.java:253) for TPC-DS Q75 against un-partitioned schema
  • [HIVE-9628] - HiveMetaStoreClient.dropPartitions(...List...) doesn't take (boolean needResult)
  • [HIVE-9633] - Add HCatClient.dropPartitions() overload to skip deletion of partition-directories.
  • [HIVE-9644] - Fold case & when udfs
  • [HIVE-9645] - Constant folding case NULL equality
  • [HIVE-9647] - Discrepancy in cardinality estimates between partitioned and un-partitioned tables
  • [HIVE-9648] - Null check key provider before doing set
  • [HIVE-9652] - Tez in place updates should detect redirection of STDERR
  • [HIVE-9655] - Dynamic partition table insertion error
  • [HIVE-9665] - Parallel move task optimization causes race condition
  • [HIVE-9667] - Disable ORC bloom filters for ORC v11 output-format
  • [HIVE-9674] - *DropPartitionEvent should handle partition-sets.
  • [HIVE-9679] - Remove redundant null-checks from DbNotificationListener.
  • [HIVE-9680] - GlobalLimitOptimizer is not checking filters correctly
  • [HIVE-9681] - Extend HiveAuthorizationProvider to support partition-sets.
  • [HIVE-9706] - HBase handler support for snapshots should confirm properties before use
  • [HIVE-9711] - ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN
  • [HIVE-9716] - Map job fails when table's LOCATION does not have scheme
  • [HIVE-9717] - The max/min function used by AggrStats for decimal type is not what we expected
  • [HIVE-9720] - Metastore does not properly migrate column stats when renaming a table across databases.
  • [HIVE-9721] - Hadoop23Shims.setFullFileStatus should check for null
  • [HIVE-9727] - GroupingID translation from Calcite
  • [HIVE-9731] - WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified
  • [HIVE-9734] - Correlating expression cannot contain unqualified column references
  • [HIVE-9735] - aggregate (smallint) fails when ORC file used: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Short
  • [HIVE-9743] - Incorrect result set for vectorized left outer join
  • [HIVE-9749] - ObjectStore schema verification logic is incorrect
  • [HIVE-9754] - rename GenericUDFLevenstein to GenericUDFLevenshtein
  • [HIVE-9755] - Hive built-in "ngram" UDAF fails when a mapper has no matches.
  • [HIVE-9767] - Fixes in Hive UDF to be usable in Pig
  • [HIVE-9770] - Beeline ignores --showHeader for non-tabular output formats, i.e. csv, tsv, dsv
  • [HIVE-9772] - Hive parquet timestamp conversion doesn't work with new Parquet
  • [HIVE-9779] - ATSHook does not log the end user if doAs=false (it logs the hs2 server user)
  • [HIVE-9791] - insert into table throws NPE
  • [HIVE-9797] - Need update some spark tests for java 8
  • [HIVE-9813] - Hive JDBC - DatabaseMetaData.getColumns method cannot find classes added with "add jar" command
  • [HIVE-9817] - fix DateFormat pattern in hive-exec
  • [HIVE-9826] - Firing insert event fails on temporary table
  • [HIVE-9831] - HiveServer2 should use ConcurrentHashMap in ThreadFactory
  • [HIVE-9832] - Merge join followed by union and a map join in hive on tez fails.
  • [HIVE-9834] - VectorGroupByOperator logs too much
  • [HIVE-9836] - Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns)
  • [HIVE-9839] - HiveServer2 leaks OperationHandle on async queries which fail at compile phase
  • [HIVE-9841] - IOException thrown by ORC should include the path of processing file
  • [HIVE-9845] - HCatSplit repeats information making input split data size huge
  • [HIVE-9848] - readlink -f is GNU coreutils only (used in bin/hive)
  • [HIVE-9851] - org.apache.hadoop.hive.serde2.avro.AvroSerializer should use org.apache.avro.generic.GenericData.Array when serializing a list
  • [HIVE-9855] - Runtime skew join doesn't work when skewed data only exists in big table
  • [HIVE-9860] - MapredLocalTask/SecureCmdDoAs leaks local files
  • [HIVE-9866] - Changing a column's type doesn't change column stats type in metastore
  • [HIVE-9869] - Trunk doesn't build with hadoop-1
  • [HIVE-9873] - Hive on MR throws DeprecatedParquetHiveInput exception
  • [HIVE-9877] - Beeline cannot run multiple statements in the same row
  • [HIVE-9886] - Hive on tez: NPE when converting join to SMB in sub-query
  • [HIVE-9892] - various MSSQL upgrade scripts don't work
  • [HIVE-9908] - vectorization error binary type not supported, group by with binary columns
  • [HIVE-9915] - Allow specifying file format for managed tables
  • [HIVE-9919] - upgrade scripts don't work on some auto-created DBs due to absence of tables
  • [HIVE-9920] - DROP DATABASE IF EXISTS throws exception if database does not exist
  • [HIVE-9923] - No clear message when "from" is missing
  • [HIVE-9929] - StatsUtil#getAvailableMemory could return negative value
  • [HIVE-9930] - fix QueryPlan.makeQueryId time format
  • [HIVE-9932] - DDLTask.conf hides base class Task.conf
  • [HIVE-9934] - Vulnerability in LdapAuthenticationProviderImpl enables HiveServer2 client to degrade the authentication mechanism to "none", allowing authentication without password
  • [HIVE-9936] - fix potential NPE in DefaultUDAFEvaluatorResolver
  • [HIVE-9944] - Convert array[] to string properly in log messages
  • [HIVE-9945] - FunctionTask.conf hides Task.conf field
  • [HIVE-9947] - ScriptOperator replaceAll uses unescaped dot and result is not assigned
  • [HIVE-9948] - SparkUtilities.getFileName passes File.separator to String.split() method
  • [HIVE-9950] - fix rehash in CuckooSetBytes and CuckooSetLong
  • [HIVE-9951] - VectorizedRCFileRecordReader creates Exception but does not throw it
  • [HIVE-9952] - fix NPE in CorrelationUtilities
  • [HIVE-9953] - fix NPE in WindowingTableFunction
  • [HIVE-9954] - UDFJson uses the == operator to compare Strings
  • [HIVE-9955] - TestVectorizedRowBatchCtx compares byte[] using equals() method
  • [HIVE-9956] - use BigDecimal.valueOf instead of new in TestFileDump
  • [HIVE-9957] - Hive 1.1.0 not compatible with Hadoop 2.4.0
  • [HIVE-9961] - HookContext for view should return a table type of VIRTUAL_VIEW
  • [HIVE-9971] - Clean up operator class
  • [HIVE-9975] - Renaming a nonexisting partition should not throw out NullPointerException
  • [HIVE-9976] - Possible race condition in DynamicPartitionPruner for

New in Apache Hive 1.1.0 (May 19, 2015)

  • Sub-task:
  • [HIVE-7073] - Implement Binary in ParquetSerDe
  • [HIVE-8121] - Create micro-benchmarks for ParquetSerde and evaluate performance
  • [HIVE-8122] - Make use of SearchArgument classes for Parquet SERDE
  • [HIVE-8130] - Support Date in Avro
  • [HIVE-8131] - Support timestamp in Avro
  • [HIVE-8362] - Investigate flaky test parallel.q
  • [HIVE-8651] - CBO: sort column changed in infer_bucket_sort test
  • [HIVE-8707] - Fix ordering differences due to Java 8 HashMap function
  • [HIVE-8718] - Refactoring: move mapLocalWork field from MapWork to BaseWork
  • [HIVE-8773] - Fix TestWebHCatE2e#getStatus for Java8
  • [HIVE-8862] - Fix ordering differences on TestParse tests due to Java8
  • [HIVE-8922] - CBO: assorted date and timestamp issues
  • [HIVE-8923] - HIVE-8512 needs to be fixed also for CBO
  • [HIVE-8936] - Add SORT_QUERY_RESULTS for join tests that do not guarantee order
  • [HIVE-8962] - Add SORT_QUERY_RESULTS for join tests that do not guarantee order #2
  • [HIVE-9030] - CBO: Plans with comparison of values with different types
  • [HIVE-9033] - Fix ordering differences due to Java8 (part 2)
  • [HIVE-9034] - CBO: type change in literal_ints.q
  • [HIVE-9035] - CBO: Disable PPD when functions are non-deterministic (ppd_random.q - non-deterministic udf rand() pushed above join)
  • [HIVE-9043] - HiveException: Conflict on row inspector for {table}
  • [HIVE-9066] - temporarily disable CBO for non-deterministic functions
  • [HIVE-9104] - windowing.q failed when mapred.reduce.tasks is set to larger than one
  • [HIVE-9109] - Add support for Java 8 specific q-test out files
  • [HIVE-9127] - Improve CombineHiveInputFormat.getSplit performance
  • [HIVE-9133] - CBO (Calcite Return Path): Refactor Semantic Analyzer to Move CBO code out
  • [HIVE-9153] - Perf enhancement on CombineHiveInputFormat and HiveInputFormat
  • [HIVE-9161] - Fix ordering differences on UDF functions due to Java8
  • [HIVE-9174] - Enable queuing of HCatalog notification events in metastore DB
  • [HIVE-9175] - Add alters to list of events handled by NotificationListener
  • [HIVE-9181] - Fix SkewJoinOptimizer related Java 8 ordering differences
  • [HIVE-9184] - Modify HCatClient to support new notification methods in HiveMetaStoreClient
  • [HIVE-9193] - Fix ordering differences due to Java 8 (Part 3)
  • [HIVE-9194] - Support select distinct *
  • [HIVE-9200] - CBO (Calcite Return Path): Inline Join, Properties
  • [HIVE-9206] - Fix Desc Formatted related Java 8 ordering differences
  • [HIVE-9211] - Research on build mini HoS cluster on YARN for unit test[Spark Branch]
  • [HIVE-9222] - Fix ordering differences due to Java 8 (Part 4)
  • [HIVE-9224] - CBO (Calcite Return Path): Inline Table, Properties
  • [HIVE-9239] - Fix ordering differences due to Java 8 (Part 5)
  • [HIVE-9241] - Fix TestCliDriver.testCliDriver_subquery_multiinsert
  • [HIVE-9257] - Merge from spark to trunk January 2015
  • [HIVE-9259] - Fix ClassCastException when CBO is enabled for HOS [Spark Branch]
  • [HIVE-9264] - Merge encryption branch to trunk
  • [HIVE-9292] - CBO (Calcite Return Path): Inline GroupBy, Properties
  • [HIVE-9315] - CBO (Calcite Return Path): Inline FileSinkOperator, Properties
  • [HIVE-9321] - Notification message size can be arbitrarily long, DbNotificationListener limits to 1024
  • [HIVE-9352] - Merge from spark to trunk (follow-up of HIVE-9257)
  • [HIVE-9409] - Avoid ser/de loggers as logging framework can be incompatible on driver and workers
  • [HIVE-9410] - ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
  • [HIVE-9425] - Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
  • [HIVE-9428] - LocalSparkJobStatus may return failed job as successful [Spark Branch]
  • [HIVE-9431] - CBO (Calcite Return Path): Removing AST from ParseContext
  • [HIVE-9434] - Shim the method Path.getPathWithoutSchemeAndAuthority
  • [HIVE-9444] - CBO (Calcite Return Path): Rewrite GlobalLimitOptimizer
  • [HIVE-9449] - Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]
  • [HIVE-9450] - [Parquet] Check all data types work for Parquet in Group By operator
  • [HIVE-9477] - No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
  • [HIVE-9487] - Make Remote Spark Context secure [Spark Branch]
  • [HIVE-9493] - Failed job may not throw exceptions [Spark Branch]
  • Bug:
  • [HIVE-1344] - error in select distinct
  • [HIVE-1654] - select distinct should allow column name regex
  • [HIVE-1869] - TestMTQueries failing on jenkins
  • [HIVE-3781] - Index related events should be delivered to metastore event listener
  • [HIVE-4009] - CLI Tests fail randomly due to MapReduce LocalJobRunner race condition
  • [HIVE-5536] - Incorrect Operation Name is passed to hookcontext
  • [HIVE-5865] - AvroDeserializer incorrectly assumes keys to Maps will always be of type 'org.apache.avro.util.Utf8'
  • [HIVE-6165] - Unify HivePreparedStatement from jdbc:hive and jdbc:hive2
  • [HIVE-6308] - COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
  • [HIVE-6421] - abs() should preserve precision/scale of decimal input
  • [HIVE-6623] - Add "owner" tag to ptest2 created instances
  • [HIVE-6683] - Beeline does not accept comments at end of line
  • [HIVE-6914] - parquet-hive cannot write nested map (map value is map)
  • [HIVE-7024] - Escape control characters for explain result
  • [HIVE-7069] - Zookeeper connection leak
  • [HIVE-7932] - It may cause NP exception when add accessed columns to ReadEntity
  • [HIVE-7951] - InputFormats implementing (Job)Configurable should not be cached
  • [HIVE-7997] - Potential null pointer reference in ObjectInspectorUtils#compareTypes()
  • [HIVE-8182] - beeline fails when executing multiple-line queries with trailing spaces
  • [HIVE-8257] - Accumulo introduces old hadoop-client dependency
  • [HIVE-8266] - create function using statement compilation should include resource URI entity
  • [HIVE-8284] - Equality comparison is done between two floating point variables in HiveRelMdUniqueKeys#getUniqueKeys()
  • [HIVE-8308] - Acid related table properties should be defined in one place and should be case insensitive
  • [HIVE-8317] - WebHCat pom should explicitly depend on jersey-core
  • [HIVE-8326] - Using DbTxnManager with concurrency off results in run time error
  • [HIVE-8330] - HiveResultSet.findColumn() parameters are case sensitive
  • [HIVE-8338] - Add ip and command to semantic analyzer hook context
  • [HIVE-8345] - q-test for Avro date support
  • [HIVE-8359] - Map containing null values are not correctly written in Parquet files
  • [HIVE-8381] - Update hive version on trunk to 0.15
  • [HIVE-8387] - add retry logic to ZooKeeperStorage in WebHCat
  • [HIVE-8448] - Union All might not work due to the type conversion issue
  • [HIVE-8450] - Create table like does not copy over table properties
  • [HIVE-8491] - Fix build name in ptest pre-commit message
  • [HIVE-8500] - beeline does not need to set hive.aux.jars.path
  • [HIVE-8512] - queries with star and gby produce incorrect results
  • [HIVE-8518] - Compile time skew join optimization returns duplicated results
  • [HIVE-8523] - Potential null dereference in DDLSemanticAnalyzer#addInputsOutputsAlterTable()
  • [HIVE-8556] - introduce overflow control and sanity check to BytesBytesMapJoin
  • [HIVE-8564] - DROP TABLE IF EXISTS throws exception if the table does not exist.
  • [HIVE-8565] - beeline may go into an infinite loop when using EOF
  • [HIVE-8576] - Guaranteed NPE in StatsRulesProcFactory
  • [HIVE-8594] - Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
  • [HIVE-8600] - Add option to log explain output for query
  • [HIVE-8610] - Compile time skew join optimization doesn't work with auto map join
  • [HIVE-8611] - grant/revoke syntax should support additional objects for authorization plugins
  • [HIVE-8612] - Support metadata result filter hooks
  • [HIVE-8613] - percentile_approx raise a comparator error
  • [HIVE-8627] - Compute stats on a table from impala caused the table to be corrupted
  • [HIVE-8634] - HiveServer2 fair scheduler queue mapping doesn't handle the secondary groups rules correctly
  • [HIVE-8636] - CBO: split cbo_correctness test
  • [HIVE-8666] - hive.metastore.server.max.threads default is too high
  • [HIVE-8680] - Set Max Message for Binary Thrift endpoints
  • [HIVE-8693] - Separate out fair scheduler dependency from hadoop 0.23 shim
  • [HIVE-8708] - Add query id to explain log option
  • [HIVE-8720] - Update orc_merge tests to make it consistent across OS'es
  • [HIVE-8728] - Fix ptf.q determinism
  • [HIVE-8730] - schemaTool failure when date partition has non-date value
  • [HIVE-8736] - add ordering to cbo_correctness to make result consistent
  • [HIVE-8757] - YARN dep in scheduler shim should be optional
  • [HIVE-8762] - HiveMetaStore.BooleanPointer should be replaced with an AtomicBoolean
  • [HIVE-8791] - Hive permission inheritance throws exception S3
  • [HIVE-8796] - TestCliDriver acid tests with decimal needs benchmark to be updated
  • [HIVE-8797] - Simultaneous dynamic inserts can result in "partition already exists" error
  • [HIVE-8803] - DESC SCHEMA is not working
  • [HIVE-8808] - HiveInputFormat caching cannot work with all input formats
  • [HIVE-8812] - TestMinimrCliDriver failure if run in the same command as TestHBaseNegativeCliDriver
  • [HIVE-8825] - SQLCompletor catches Throwable and ignores it
  • [HIVE-8847] - Fix bugs in jenkins scripts
  • [HIVE-8848] - data loading from text files or text file processing doesn't handle nulls correctly
  • [HIVE-8850] - ObjectStore:: rollbackTransaction() needs to be looked into further.
  • [HIVE-8863] - Cannot drop table with uppercase name after "compute statistics for columns"
  • [HIVE-8869] - RowSchema not updated for some ops when columns are pruned
  • [HIVE-8872] - Hive view of HBase range scan intermittently returns incorrect data.
  • [HIVE-8874] - Error Accessing HBase from Hive via Oozie on Kerberos 5.0.1 cluster
  • [HIVE-8875] - hive.optimize.sort.dynamic.partition should be turned off for ACID
  • [HIVE-8877] - improve context logging during job submission via WebHCat
  • [HIVE-8879] - Upgrade derby version to address race condition
  • [HIVE-8881] - Receiving json "{"error":"Could not find job job_1415748506143_0002"}" when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
  • [HIVE-8889] - JDBC Driver ResultSet.getXXXXXX(String columnLabel) methods Broken
  • [HIVE-8891] - Another possible cause to NucleusObjectNotFoundException from drops/rollback
  • [HIVE-8893] - Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
  • [HIVE-8901] - increase retry attempt, interval on metastore database errors
  • [HIVE-8909] - Hive doesn't correctly read Parquet nested types
  • [HIVE-8914] - HDFSCleanup thread holds reference to FileSystem
  • [HIVE-8916] - Handle user@domain username under LDAP authentication
  • [HIVE-8917] - HIVE-5679 adds two thread safety problems
  • [HIVE-8926] - Projections that only swap input columns are identified incorrectly as identity projections
  • [HIVE-8938] - Compiler should save the transform URI as input entity
  • [HIVE-8944] - TestCompactor fails with IncompatibleClassChangeError
  • [HIVE-8948] - TestStreaming is flaky
  • [HIVE-8964] - Some TestMiniTezCliDriver tests taking two hours
  • [HIVE-8965] - Enhance PTest to kill all processes between tests and to report when a TEST*.xml file is not generated
  • [HIVE-8967] - Fix bucketmapjoin7.q determinism
  • [HIVE-8975] - Possible performance regression on bucket_map_join_tez2.q
  • [HIVE-8978] - Fix test determinism issue for qfile: smb_mapjoin_1.q etc
  • [HIVE-8990] - mapjoin_mapjoin.q is failing on Tez (missed golden file update)
  • [HIVE-9001] - Ship with log4j.properties file that has a reliable time based rolling policy
  • [HIVE-9006] - hiveserver thrift api version is still 6
  • [HIVE-9011] - Fix parquet_join.q determinism
  • [HIVE-9024] - NullPointerException when starting webhcat server if templeton.hive.properties is not set
  • [HIVE-9032] - Help for orcfiledump script does not reflect new options
  • [HIVE-9048] - Hive build failed on hadoop-1 after HIVE-8828.
  • [HIVE-9055] - Tez: union all followed by group by followed by another union all gives error
  • [HIVE-9060] - Fix child operator references after NonBlockingOpDeDupProc
  • [HIVE-9077] - Set completer in CliDriver is not working
  • [HIVE-9096] - GenericUDF may be left unclosed in PartitionPrune#visitCall()
  • [HIVE-9113] - Explain on query failed with NPE
  • [HIVE-9120] - Hive Query log does not work when hive.exec.parallel is true
  • [HIVE-9122] - Need to remove additional references to hive-shims-common-secure, hive-shims-0.20
  • [HIVE-9129] - Migrate to newer Calcite snapshot, where ByteString is now in org.apache.calcite.avatica.util
  • [HIVE-9130] - vector_partition_diff_num_cols result is not updated after CBO upgrade
  • [HIVE-9131] - MiniTez optimize_nullscan test is unstable
  • [HIVE-9149] - Add unit test to test implicit conversion during dynamic partitioning/distribute by
  • [HIVE-9150] - Unrelated types are compared in GenTezWork#getFollowingWorkIndex()
  • [HIVE-9154] - Cache pathToPartitionInfo in context aware record reader
  • [HIVE-9177] - Fix child operator references after NonBlockingOpDeDupProc (II)
  • [HIVE-9195] - CBO changes constant to column type
  • [HIVE-9197] - fix lvj_mapjoin.q diff in trunk
  • [HIVE-9199] - Excessive exclusive lock used in some DDLs with DummyTxnManager
  • [HIVE-9203] - CREATE TEMPORARY FUNCTION hangs trying to acquire lock
  • [HIVE-9215] - Some mapjoin queries broken with IdentityProjectRemover with PPD
  • [HIVE-9221] - Remove deprecation warning for hive.metastore.local
  • [HIVE-9242] - Many places in CBO code eat exceptions
  • [HIVE-9243] - Static Map in IOContext is not thread safe
  • [HIVE-9255] - Fastpath for limited fetches from unpartitioned tables
  • [HIVE-9296] - Need to add schema upgrade changes for queueing events in the database
  • [HIVE-9299] - Reuse Configuration in AvroSerdeUtils
  • [HIVE-9300] - Make TCompactProtocol configurable
  • [HIVE-9301] - Potential null dereference in MoveTask#createTargetPath()
  • [HIVE-9309] - schematool fails on Postgres 8.1
  • [HIVE-9310] - CLI JLine does not flush history back to ~/.hivehistory
  • [HIVE-9316] - TestSqoop tests in WebHCat testsuite hardcode libdir path to hdfs
  • [HIVE-9330] - DummyTxnManager will throw NPE if WriteEntity writeType has not been set
  • [HIVE-9331] - get rid of pre-optimized-hashtable memory optimizations
  • [HIVE-9344] - Fix flaky test optimize_nullscan
  • [HIVE-9347] - Bug with max() together with rank() and grouping sets
  • [HIVE-9351] - Running Hive Jobs with Tez causes templeton to never report percent complete
  • [HIVE-9353] - make TABLE keyword optional in INSERT INTO TABLE foo...
  • [HIVE-9366] - wrong date in description annotation in date_add() and date_sub() udf
  • [HIVE-9369] - fix arguments length checking in Upper and Lower UDF
  • [HIVE-9377] - UDF in_file() in WHERE predicate causes NPE.
  • [HIVE-9381] - HCatalog hardcodes maximum append limit to 1000.
  • [HIVE-9382] - Query got rerun with Global Limit optimization on and Fetch optimization off
  • [HIVE-9386] - FileNotFoundException when using in_file()
  • [HIVE-9393] - reduce noisy log level of ColumnarSerDe.java:116 from INFO to DEBUG
  • [HIVE-9396] - date_add()/date_sub() should allow tinyint/smallint/bigint arguments in addition to int
  • [HIVE-9414] - Fixup post HIVE-9264 - Merge encryption branch to trunk
  • [HIVE-9437] - Beeline does not add any existing HADOOP_CLASSPATH
  • [HIVE-9440] - Folders may not be pruned for Hadoop 2
  • [HIVE-9441] - Remove call to deprecated Calcite method
  • [HIVE-9443] - ORC PPD - fix fuzzy case evaluation of IS_NULL
  • [HIVE-9445] - Revert HIVE-5700 - enforce single date format for partition column storage
  • [HIVE-9446] - JDBC DatabaseMetadata.getColumns() does not work for temporary tables
  • [HIVE-9448] - Merge spark to trunk 1/23/15
  • [HIVE-9454] - Test failures due to new Calcite version
  • [HIVE-9462] - HIVE-8577 - breaks type evolution
  • [HIVE-9475] - HiveMetastoreClient.tableExists does not work
  • [HIVE-9476] - Beeline fails to start on trunk
  • [HIVE-9502] - Parquet cannot read Map types from files written with Hive

New in Apache Hive 1.0.0 (Feb 5, 2015)

  • Bug:
  • [HIVE-5631] - Index creation on a skew table fails
  • [HIVE-5664] - Drop cascade database fails when the db has any tables with indexes
  • [HIVE-6198] - ORC file and struct column names are case sensitive
  • [HIVE-6468] - HS2 & Metastore using SASL out of memory error when curl sends a get request
  • [HIVE-7270] - SerDe Properties are not considered by show create table Command
  • [HIVE-8099] - IN operator for partition column fails when the partition column type is DATE
  • [HIVE-8295] - Add batch retrieve partition objects for metastore direct sql
  • [HIVE-8374] - schematool fails on Postgres versions < 9.2
  • [HIVE-8485] - HMS on Oracle incompatibility
  • [HIVE-8706] - Table statistic collection on counter failed due to table name character case.
  • [HIVE-8715] - Hive 14 upgrade scripts can fail for statistics if database was created using auto-create
  • [HIVE-8739] - handle Derby and Oracle errors with joins and filters in Direct SQL in an invalid-DB-specific path
  • [HIVE-8784] - Querying partition does not work with JDO enabled against PostgreSQL
  • [HIVE-8805] - CBO skipped due to SemanticException: Line 0:-1 Both left and right aliases encountered in JOIN 'avg_cs_ext_discount_amt'
  • [HIVE-8807] - Obsolete default values in webhcat-default.xml
  • [HIVE-8811] - Dynamic partition pruning can result in NPE during query compilation
  • [HIVE-8827] - Remove SSLv2Hello from list of disabled protocols
  • [HIVE-8830] - hcatalog process doesn't exit because of non-daemon thread
  • [HIVE-8845] - Switch to Tez 0.5.2
  • [HIVE-8866] - Vectorization on partitioned table throws ArrayIndexOutOfBoundsException when partitions are not of same #of columns
  • [HIVE-8870] - errors when selecting a struct field within an array from ORC based tables
  • [HIVE-8873] - Switch to calcite 0.9.2
  • [HIVE-8876] - incorrect upgrade script for Oracle (13->14)
  • [HIVE-8880] - non-synchronized access to split list in OrcInputFormat
  • [HIVE-8886] - Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup
  • [HIVE-8888] - Mapjoin with LateralViewJoin generates wrong plan in Tez
  • [HIVE-8947] - HIVE-8876 also affects Postgres < 9.2
  • [HIVE-8966] - Delta files created by hive hcatalog streaming cannot be compacted
  • [HIVE-9003] - Vectorized IF expr broken for the scalar and scalar case
  • [HIVE-9025] - join38.q (without map join) produces incorrect result when testing with multiple reducers
  • [HIVE-9038] - Join tests fail on Tez
  • [HIVE-9051] - TezJobMonitor in-place updates logs too often to logfile
  • [HIVE-9053] - select constant in union all followed by group by gives wrong result
  • [HIVE-9067] - OrcFileMergeOperator may create merge file that does not match properties of input files
  • [HIVE-9090] - Rename "Tez File Merge Work" to smaller name
  • [HIVE-9108] - Fix for HIVE-8735 is incorrect (stats with long paths)
  • [HIVE-9111] - Potential NPE in OrcStruct for list and map types
  • [HIVE-9112] - Query may generate different results depending on the number of reducers
  • [HIVE-9114] - union all query in cbo test has undefined ordering
  • [HIVE-9126] - Backport HIVE-8827 (Remove SSLv2Hello from list of disabled protocols) to 0.14 branch
  • [HIVE-9141] - HiveOnTez: mix of union all, distinct, group by generates error
  • [HIVE-9155] - HIVE_LOCKS uses int instead of bigint hive-txn-schema-0.14.0.mssql.sql
  • [HIVE-9162] - stats19 test is environment-dependent
  • [HIVE-9166] - Place an upper bound for SARG CNF conversion
  • [HIVE-9168] - Vectorized Coalesce for strings is broken
  • [HIVE-9205] - Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
  • [HIVE-9234] - HiveServer2 leaks FileSystem objects in FileSystem.CACHE
  • [HIVE-9249] - java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.HiveVarcharWritable cannot be cast to org.apache.hadoop.hive.common.type.HiveVarchar when joining tables
  • [HIVE-9278] - Cached expression feature broken in one case
  • [HIVE-9317] - move Microsoft copyright to NOTICE file
  • [HIVE-9359] - Export of a large table causes OOM in Metastore and Client
  • [HIVE-9361] - Intermittent NPE in SessionHiveMetaStoreClient.alterTempTable
  • [HIVE-9390] - Enhance retry logic wrt DB access in TxnHandler
  • [HIVE-9401] - Backport: Fastpath for limited fetches from unpartitioned tables
  • [HIVE-9404] - NPE in org.apache.hadoop.hive.metastore.txn.TxnHandler.determineDatabaseProduct()
  • [HIVE-9436] - RetryingMetaStoreClient does not retry JDOExceptions
  • [HIVE-9473] - sql std auth should disallow built-in udfs that allow any java methods to be called
  • [HIVE-9514] - schematool is broken in hive 1.0.0
  • Improvement:
  • [HIVE-3280] - Make HiveMetaStoreClient a public API
  • [HIVE-8933] - Check release builds for SNAPSHOT dependencies
  • Task:
  • [HIVE-6977] - Delete Hiveserver1

New in Apache Hive 0.14.0 (Feb 5, 2015)

  • Sub-task:
  • [HIVE-4629] - HS2 should support an API to retrieve query logs
  • [HIVE-5176] - Wincompat : Changes for allowing various path compatibilities with Windows
  • [HIVE-5179] - Wincompat : change script tests from bash to sh
  • [HIVE-5338] - TestJdbcDriver2 is failing on trunk.
  • [HIVE-5760] - Add vectorized support for CHAR/VARCHAR data types
  • [HIVE-5998] - Add vectorized reader for Parquet files
  • [HIVE-6031] - explain subquery rewrite for where clause predicates
  • [HIVE-6123] - Implement checkstyle in maven
  • [HIVE-6252] - sql std auth - support 'with admin option' in revoke role metastore api
  • [HIVE-6290] - Add support for hbase filters for composite keys
  • [HIVE-6367] - Implement Decimal in ParquetSerde
  • [HIVE-6394] - Implement Timestamp in ParquetSerde
  • [HIVE-6445] - Add qop support for kerberos over http in HiveServer2
  • [HIVE-6626] - Hive does not expand the DOWNLOADED_RESOURCES_DIR path
  • [HIVE-6627] - HiveServer2 should handle scratch dir permissions / errors in a better way
  • [HIVE-6714] - Fix getMapSize() of LazyMap
  • [HIVE-6735] - Make scalable dynamic partitioning work in vectorized mode
  • [HIVE-6760] - Scalable dynamic partitioning should bail out properly for list bucketing
  • [HIVE-6761] - Hashcode computation does not use maximum parallelism for scalable dynamic partitioning
  • [HIVE-6815] - Version of the HIVE-6374 for Hive 0.13
  • [HIVE-6982] - Export all .sh equivalent for windows (.cmd files) in bin, bin/ext
  • [HIVE-6993] - Update hive for Tez VertexLocationHint and getAvailableResource API changes
  • [HIVE-7029] - Vectorize ReduceWork
  • [HIVE-7078] - Need file sink operators that work with ACID
  • [HIVE-7094] - Separate out static/dynamic partitioning code in FileRecordWriterContainer
  • [HIVE-7156] - Group-By operator stat-annotation only uses distinct approx to generate rollups
  • [HIVE-7184] - TestHadoop20SAuthBridge no longer compiles after HADOOP-10448
  • [HIVE-7204] - Use NULL vertex location hint for Prewarm DAG vertices
  • [HIVE-7262] - Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
  • [HIVE-7286] - Parameterize HCatMapReduceTest for testing against all Hive storage formats
  • [HIVE-7291] - Refactor TestParser to understand test-property file
  • [HIVE-7350] - Changes related to TEZ-692, TEZ-1169, TEZ-1234
  • [HIVE-7357] - Add vectorized support for BINARY data type
  • [HIVE-7398] - Parent GBY of MUX is removed even if it's not for semijoin
  • [HIVE-7404] - Revoke privilege should support revoking of grant option
  • [HIVE-7405] - Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
  • [HIVE-7420] - Parameterize tests for HCatalog Pig interfaces for testing against all storage formats
  • [HIVE-7427] - Changes for EdgeConfigurations
  • [HIVE-7457] - Minor HCatalog Pig Adapter test clean up
  • [HIVE-7491] - Stats annotation fails to evaluate constant expressions in filter operator
  • [HIVE-7513] - Add ROW__ID VirtualColumn
  • [HIVE-7535] - Make use of number of nulls column statistics in filter rule
  • [HIVE-7536] - Make use of decimal column statistics in statistics annotation
  • [HIVE-7544] - Changes related to TEZ-1288 (FastTezSerialization)
  • [HIVE-7548] - Precondition checks should not fail the merge task in case of automatic trigger
  • [HIVE-7571] - RecordUpdater should read virtual columns from row
  • [HIVE-7589] - Some fixes and improvements to statistics annotation rules
  • [HIVE-7601] - Bring up tez-branch up to the API changes from TEZ-1058, TEZ-1303, TEZ-1346, TEZ-1041
  • [HIVE-7639] - Bring tez-branch up to API changes in TEZ-1379, TEZ-1057, TEZ-1382
  • [HIVE-7646] - Modify parser to support new grammar for Insert,Update,Delete
  • [HIVE-7655] - CBO: Reading of partitioned table stats slows down explain
  • [HIVE-7656] - Bring tez-branch up to the API changes made by TEZ-1372
  • [HIVE-7663] - OrcRecordUpdater needs to implement getStats
  • [HIVE-7679] - JOIN operator should update the column stats when number of rows changes
  • [HIVE-7734] - Join stats annotation rule is not updating columns statistics correctly
  • [HIVE-7735] - Implement Char, Varchar in ParquetSerDe
  • [HIVE-7788] - Generate plans for insert, update, and delete
  • [HIVE-7790] - Update privileges to check for update and delete
  • [HIVE-7808] - Changes to work against Tez-0.5 RC
  • [HIVE-7809] - Fix ObjectRegistry to work with Tez 0.5
  • [HIVE-7820] - union_null.q is not deterministic
  • [HIVE-7825] - Bring tez-branch up to the API changes made by TEZ-1472, TEZ-1469
  • [HIVE-7836] - Ease-out denominator for multi-attribute join case in statistics annotation
  • [HIVE-7864] - [CBO] Query fails if it refers only partitioning column
  • [HIVE-7869] - Build long running HS2 test framework
  • [HIVE-7904] - Missing null check cause NPE when updating join column stats in statistics annotation
  • [HIVE-7905] - CBO: more cost model changes
  • [HIVE-7907] - Bring up tez branch to changes in TEZ-1038, TEZ-1500
  • [HIVE-7935] - Support dynamic service discovery for HiveServer2
  • [HIVE-7979] - Fix testconfiguration.property file in Tez branch
  • [HIVE-7990] - With fetch column stats disabled number of elements in grouping set is not taken into account
  • [HIVE-7991] - Incorrect calculation of number of rows in JoinStatsRule.process results in overflow
  • [HIVE-7992] - StatsRulesProcFactory should gracefully handle overflows
  • [HIVE-7994] - BMJ test fails on tez
  • [HIVE-7995] - Column statistics from expression does not handle fields within complex types
  • [HIVE-8003] - CBO: Handle Literal casting, Restrict CBO to select queries, Translate Strings, Optiq Log
  • [HIVE-8006] - CBO Trunk Merge: Test fail that includes Table Sample, rows(), query hints
  • [HIVE-8016] - CBO: PPD to honor hive Join Cond, Casting fixes, Add annotations for IF, Code cleanup
  • [HIVE-8021] - CBO: support CTAS and insert ... select
  • [HIVE-8046] - CBO: fix issues with Windowing queries
  • [HIVE-8069] - CBO: RowResolver after SubQuery predicate handling should be reset to outer query block RR
  • [HIVE-8076] - CBO Trunk Merge: Test Failure input23
  • [HIVE-8111] - CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
  • [HIVE-8125] - CBO Trunk Merge: On Failure Fall Back to Non CBO
  • [HIVE-8144] - CBO: HiveProjectRel factory should create RelSubSets
  • [HIVE-8145] - CBO: bail from Optiq planning if a Select list contains multiple references to the same name
  • [HIVE-8159] - CBO: bail from Optiq planning if a Select list contains multiple references to the same name
  • [HIVE-8168] - With dynamic partition enabled fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality using physical plan generation)
  • [HIVE-8172] - HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace
  • [HIVE-8173] - HiveServer2 dynamic service discovery: figure out best ZooKeeper ACLs for security
  • [HIVE-8186] - Self join may fail if one side has virtual column(s) and other doesn't
  • [HIVE-8193] - Hook HiveServer2 dynamic service discovery with session time out
  • [HIVE-8194] - CBO: bail for having clause referring select expr aliases
  • [HIVE-8199] - CBO Trunk Merge: quote2 test fails due to incorrect literal translation
  • [HIVE-8223] - CBO Trunk Merge: partition_wise_fileformat2 select result depends on ordering
  • [HIVE-8228] - CBO: fix couple of issues with partition pruning
  • [HIVE-8237] - CBO: Use Fully qualified table name (db.tablename in ReloptHiveTable)
  • [HIVE-8288] - HiveServer2 dynamic discovery should create znodes organized by version number & add support for removing server URIs of a particular version from the server script.
  • [HIVE-8309] - CBO: Fix OB by removing constraining DT, Use external names for col Aliases, Remove unnecessary Selects, Make DT Name counter query specific
  • [HIVE-8377] - Enable Kerberized SSL for HiveServer2 in http mode
  • [HIVE-8454] - Select Operator does not rename column stats properly in case of select star
  • [HIVE-8522] - CBO: Update Calcite Version to 0.9.2-incubating-SNAPSHOT
  • [HIVE-8530] - CBO: Preserve types of literals
  • [HIVE-8549] - NPE in PK-FK inference when one side of join is complex tree
  • [HIVE-8580] - Support LateralViewJoinOperator and LateralViewForwardOperator in stats annotation
  • [HIVE-8582] - CBO: Outer Join Simplification is broken
  • [HIVE-8653] - CBO: Push Semi Join through, Project/Filter/Join
  • [HIVE-8654] - CBO: parquet_ctas test returns incorrect results
  • [HIVE-8655] - CBO: ppr_pushdown, udf_substr produces incorrect results due to broken tablesample handling
  • [HIVE-8656] - CBO: auto_join_filters fails
  • [HIVE-8657] - CBO: inputddl5, udf_reverse tests fail
  • [HIVE-8662] - CBO: tez_dml fails
  • [HIVE-8768] - CBO: Fix filter selectivity for "in clause" & "<>"
  • Bug:
  • [HIVE-1363] - 'SHOW TABLE EXTENDED LIKE' command does not strip single/double quotes
  • [HIVE-1879] - Remove hive.metastore.metadb.dir property from hive-default.xml and HiveConf
  • [HIVE-2390] - Add UNIONTYPE serialization support to LazyBinarySerDe
  • [HIVE-2597] - Repeated key in GROUP BY is erroneously displayed when using DISTINCT
  • [HIVE-2638] - Tests fail when Hive is run against Hadoop 0.23
  • [HIVE-3392] - Hive unnecessarily validates table SerDes when dropping a table
  • [HIVE-3925] - dependencies of fetch task are not shown by explain
  • [HIVE-4064] - Handle db qualified names consistently across all HiveQL statements
  • [HIVE-4118] - ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails when using fully qualified table name
  • [HIVE-4274] - Table created using HCatalog java client doesn't set the owner
  • [HIVE-4561] - Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000 if all the column values are larger than 0.0 (or if all column values are smaller than 0.0)
  • [HIVE-4576] - templeton.hive.properties does not allow values with commas
  • [HIVE-4723] - DDLSemanticAnalyzer.addTablePartsOutputs eats several exceptions
  • [HIVE-4795] - Delete/Alter/Describe actions fail when SerDe is not on class path
  • [HIVE-4965] - Add support so that PTFs can stream their output; Windowing PTF should do this
  • [HIVE-5077] - Provide an option to run local task in process
  • [HIVE-5092] - Fix hiveserver2 mapreduce local job on Windows
  • [HIVE-5336] - HCatSchema.remove(HCatFieldSchema hcatFieldSchema) should renumber the fieldPositionMap and the fieldPositionMap should not be cached by the end user
  • [HIVE-5339] - TestJdbcDriver2 is failing on trunk.
  • [HIVE-5376] - Hive does not honor type for partition columns when altering column type
  • [HIVE-5456] - Queries fail on avro backed table with empty partition
  • [HIVE-5607] - Hive fails to parse the "%" (mod) sign after brackets.
  • [HIVE-5677] - Beeline warns about unavailable files if HIVE_OPTS is set
  • [HIVE-5789] - WebHCat E2E tests do not launch on Windows
  • [HIVE-5847] - DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
  • [HIVE-5870] - Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
  • [HIVE-6035] - Windows: percentComplete returned by job status from WebHCat is null
  • [HIVE-6093] - table creation should fail when user does not have permissions on db
  • [HIVE-6149] - TestJdbcDriver2 is unable to drop a database created from previous runs ("hbasedb")
  • [HIVE-6176] - Beeline gives bogus error message if an unaccepted command line option is given
  • [HIVE-6187] - Add test to verify that DESCRIBE TABLE works with quoted table names
  • [HIVE-6200] - Hive custom SerDe cannot load DLL added by "ADD FILE" command
  • [HIVE-6245] - HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
  • [HIVE-6305] - test use of quoted identifiers in user/role names
  • [HIVE-6313] - Minimr tests in hadoop-1 hang on shutdown
  • [HIVE-6321] - hiveserver2 --help says Unrecognized option: -h
  • [HIVE-6322] - Fix file_with_header_footer_negative.q
  • [HIVE-6331] - HIVE-5279 deprecated UDAF class without explanation/documentation/alternative
  • [HIVE-6374] - Hive job submitted with non-default name node (fs.default.name) doesn't process locations properly
  • [HIVE-6437] - DefaultHiveAuthorizationProvider should not initialize a new HiveConf
  • [HIVE-6446] - Ability to specify hadoop.bin.path from command line -D
  • [HIVE-6447] - Bucket map joins in hive-tez
  • [HIVE-6480] - Metastore server startup script ignores ENV settings
  • [HIVE-6487] - PTest2 does not copy failed source directories
  • [HIVE-6508] - Mismatched results between vector and non-vector mode with decimal field
  • [HIVE-6511] - casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on
  • [HIVE-6515] - Custom vertex in hive-tez should be able to accept multiple MR-inputs
  • [HIVE-6521] - WebHCat cannot fetch correct percentComplete for Hive jobs
  • [HIVE-6531] - Runtime errors in vectorized execution.
  • [HIVE-6538] - yet another annoying exception in test logs
  • [HIVE-6549] - remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh
  • [HIVE-6550] - SemanticAnalyzer.reset() doesn't clear all the state
  • [HIVE-6555] - TestSchemaTool is failing on trunk after branching
  • [HIVE-6560] - varchar and char types cannot be cast to binary
  • [HIVE-6563] - hdfs jar being pulled in when creating a hadoop-2 based hive tar ball
  • [HIVE-6564] - WebHCat E2E tests that launch MR jobs fail on check job completion timeout
  • [HIVE-6569] - HCatalog still has references to deprecated property hive.metastore.local
  • [HIVE-6570] - Hive variable substitution does not work with the "source" command
  • [HIVE-6571] - query id should be available for logging during query compilation
  • [HIVE-6583] - wrong sql comments : ----... instead of -- ---...
  • [HIVE-6586] - Update parameters in HiveConf.java after commit HIVE-6037
  • [HIVE-6592] - WebHCat E2E test abort when pointing to https url of webhdfs
  • [HIVE-6594] - UnsignedInt128 addition does not increase internal int array count resulting in corrupted values during serialization
  • [HIVE-6597] - WebHCat E2E tests doAsTests_6 and doAsTests_7 need to be updated
  • [HIVE-6601] - alter database commands should support schema synonym keyword
  • [HIVE-6602] - Multi-user HiveServer2 throws error
  • [HIVE-6612] - Misspelling "schemaTool completeted"
  • [HIVE-6620] - UDF printf doesn't take either CHAR or VARCHAR as the first argument
  • [HIVE-6622] - UDF translate doesn't take either CHAR or VARCHAR as any of its arguments
  • [HIVE-6637] - UDF in_file() doesn't take CHAR or VARCHAR as input
  • [HIVE-6648] - Permissions are not inherited correctly when tables have multiple partition columns
  • [HIVE-6652] - Beeline gives evasive error message for any unrecognized command line argument
  • [HIVE-6669] - sourcing txn-script from schema script results in failure for mysql & oracle
  • [HIVE-6684] - Beeline does not accept comments that are preceded by spaces
  • [HIVE-6695] - bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
  • [HIVE-6698] - hcat.py script does not correctly load the hbase storage handler jars
  • [HIVE-6707] - Lazy maps are broken (LazyMap and LazyBinaryMap)
  • [HIVE-6709] - HiveServer2 help command is not recognized properly.
  • [HIVE-6711] - ORC maps use getMapSize() from MapOI which is unreliable
  • [HIVE-6715] - Hive JDBC should include username into open session request for non-sasl connection
  • [HIVE-6724] - HCatStorer throws ClassCastException while storing tinyint/smallint data
  • [HIVE-6726] - Hcat cli does not close SessionState
  • [HIVE-6741] - HiveServer2 startup fails in secure (kerberos) mode due to backward incompatible hadoop change
  • [HIVE-6745] - HCat MultiOutputFormat hardcodes DistributedCache keynames
  • [HIVE-6756] - alter table set fileformat should set serde too
  • [HIVE-6768] - remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
  • [HIVE-6773] - Update readme for ptest2 framework
  • [HIVE-6782] - HiveServer2 concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
  • [HIVE-6783] - Incompatible schema for maps between parquet-hive and parquet-pig
  • [HIVE-6784] - parquet-hive should allow column type change
  • [HIVE-6785] - query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe
  • [HIVE-6788] - Abandoned opened transactions not being timed out
  • [HIVE-6792] - hive.warehouse.subdir.inherit.perms doesn't work correctly in CTAS
  • [HIVE-6793] - DDLSemanticAnalyzer.analyzeShowRoles() should use HiveAuthorizationTaskFactory
  • [HIVE-6807] - add HCatStorer ORC test to test missing columns
  • [HIVE-6811] - LOAD command does not work with relative paths on Windows
  • [HIVE-6817] - Some hadoop2-only tests need diffs to be updated
  • [HIVE-6820] - HiveServer(2) ignores HIVE_OPTS
  • [HIVE-6822] - TestAvroSerdeUtils fails with -Phadoop-2
  • [HIVE-6824] - Hive HBase query fails on Tez due to missing jars - part 2
  • [HIVE-6826] - Hive-tez has issues when different partitions work off of different input types
  • [HIVE-6828] - Hive tez bucket map join conversion interferes with map join conversion
  • [HIVE-6835] - Reading of partitioned Avro data fails if partition schema does not match table schema
  • [HIVE-6843] - INSTR for UTF-8 returns incorrect position
  • [HIVE-6847] - Improve / fix bugs in Hive scratch dir setup
  • [HIVE-6853] - show create table for hbase tables should exclude LOCATION
  • [HIVE-6858] - Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
  • [HIVE-6861] - more hadoop2 only golden files to fix
  • [HIVE-6862] - add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
  • [HIVE-6868] - Create table in HCatalog sets different SerDe defaults than what is set through the CLI
  • [HIVE-6870] - Fix maven.repo.local setting in Hive build
  • [HIVE-6871] - Build fixes to allow Windows to run TestCliDriver
  • [HIVE-6877] - TestOrcRawRecordMerger is deleting test.tmp.dir
  • [HIVE-6880] - TestHWISessionManager fails with -Phadoop-2
  • [HIVE-6883] - Dynamic partitioning optimization does not honor sort order or order by
  • [HIVE-6884] - HiveLockObject and enclosed HiveLockObjectData override equals() method but didn't do so for hashCode()
  • [HIVE-6888] - Hive leaks MapWork objects via Utilities::gWorkMap
  • [HIVE-6890] - Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side
  • [HIVE-6891] - Alter rename partition Perm inheritance and general partition/table group inheritance
  • [HIVE-6898] - Functions in hive are failing with java.lang.ClassNotFoundException on Tez
  • [HIVE-6900] - HostUtil.getTaskLogUrl signature change causes compilation to fail
  • [HIVE-6901] - Explain plan doesn't show operator tree for the fetch operator
  • [HIVE-6908] - TestThriftBinaryCLIService.testExecuteStatementAsync has intermittent failures
  • [HIVE-6910] - Invalid column access info for partitioned table
  • [HIVE-6913] - Hive unable to find the hashtable file during complex multi-staged map join
  • [HIVE-6915] - Hive Hbase queries fail on secure Tez cluster
  • [HIVE-6916] - Export/import inherit permissions from parent directory
  • [HIVE-6919] - hive sql std auth select query fails on partitioned tables
  • [HIVE-6921] - index creation fails with sql std auth turned on
  • [HIVE-6922] - NullPointerException in collect_set() UDAF
  • [HIVE-6927] - Add support for MSSQL in schematool
  • [HIVE-6928] - Beeline should not chop off "describe extended" results by default
  • [HIVE-6931] - Windows unit test fixes
  • [HIVE-6932] - hive README needs update
  • [HIVE-6934] - PartitionPruner doesn't handle top level constant expression correctly
  • [HIVE-6936] - Provide table properties to InputFormats
  • [HIVE-6937] - Fix test reporting URLs after jenkins move from bigtop
  • [HIVE-6939] - TestExecDriver.testMapRedPlan3 fails on hadoop-2
  • [HIVE-6944] - WebHCat e2e tests broken by HIVE-6432
  • [HIVE-6945] - issues with dropping partitions on Oracle
  • [HIVE-6946] - Make it easier to run WebHCat e2e tests
  • [HIVE-6947] - More fixes for tests on hadoop-2
  • [HIVE-6952] - Hive 0.13 HiveOutputFormat breaks backwards compatibility
  • [HIVE-6954] - After ALTER FILEFORMAT, DESCRIBE throwing exception
  • [HIVE-6955] - ExprNodeColDesc isSame doesn't account for tabAlias: this affects trait Propagation in Joins
  • [HIVE-6956] - Duplicate partitioning column for union when dynamic partition sort optimization is enabled
  • [HIVE-6957] - SQL authorization does not work with HS2 binary mode and Kerberos auth
  • [HIVE-6959] - Enable Constant propagation optimizer for Hive Vectorization
  • [HIVE-6960] - Set Hive pom to use Hadoop-2.4
  • [HIVE-6961] - Drop partitions treats partition columns as strings
  • [HIVE-6965] - Transaction manager should use RDBMS time instead of machine time
  • [HIVE-6966] - More fixes for TestCliDriver on Windows
  • [HIVE-6967] - Hive transaction manager fails when SQLServer is used as an RDBMS
  • [HIVE-6968] - list bucketing feature does not update the location map for unpartitioned tables
  • [HIVE-6972] - jdbc HTTP configuration options should be part of sessionConf part of connection string
  • [HIVE-6976] - Show query id only when there are jobs on the cluster
  • [HIVE-6978] - beeline always exits with 0 status, should exit with non-zero status on error
  • [HIVE-6979] - Hadoop-2 test failures related to quick stats not being populated correctly
  • [HIVE-6984] - Analyzing partitioned table with NULL values for the partition column failed with NPE
  • [HIVE-6985] - sql std auth - privileges grants to public role not being honored
  • [HIVE-6986] - MatchPath fails with small resultExprString
  • [HIVE-6987] - Metastore qop settings won't work with Hadoop-2.4
  • [HIVE-6989] - Error with arithmetic operators with javaXML serialization
  • [HIVE-6995] - GenericUDFBridge should log exception when it is unable to instantiate UDF object
  • [HIVE-6996] - FS based stats broken with indexed tables
  • [HIVE-7001] - fs.permissions.umask-mode is getting unset when Session is started
  • [HIVE-7003] - Fix typo in README
  • [HIVE-7004] - Fix more unit test failures on hadoop-2
  • [HIVE-7005] - MiniTez tests have non-deterministic explain plans
  • [HIVE-7006] - Fix ql_rewrite_gbtoidx.q output file
  • [HIVE-7009] - HIVE_USER_INSTALL_DIR could not be set to non-HDFS filesystem
  • [HIVE-7011] - HiveInputFormat's split generation isn't thread safe
  • [HIVE-7012] - Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
  • [HIVE-7015] - Failing to inherit group/permission should not fail the operation
  • [HIVE-7016] - Hive returns wrong results when execute UDF on top of DISTINCT column
  • [HIVE-7017] - Insertion into Parquet tables fails under Tez
  • [HIVE-7023] - Bucket mapjoin is broken when the number of small aliases is two or more
  • [HIVE-7027] - Hive job fails when referencing a view that explodes an array
  • [HIVE-7030] - Remove hive.hadoop.classpath from hiveserver2.cmd
  • [HIVE-7031] - Utilities.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS
  • [HIVE-7033] - grant statements should check if the role exists
  • [HIVE-7035] - Templeton returns 500 for user errors - when job cannot be found
  • [HIVE-7037] - Add additional tests for transform clauses with Tez
  • [HIVE-7041] - DoubleWritable/ByteWritable should extend their hadoop counterparts
  • [HIVE-7042] - Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2
  • [HIVE-7043] - When using the tez session pool via hive, once sessions time out, all queries go to the default queue
  • [HIVE-7045] - Wrong results in multi-table insert aggregating without group by clause
  • [HIVE-7050] - Display table level column stats in DESCRIBE FORMATTED TABLE
  • [HIVE-7051] - Display partition level column stats in DESCRIBE FORMATTED PARTITION
  • [HIVE-7052] - Optimize split calculation time
  • [HIVE-7053] - Unable to fetch column stats from decimal columns
  • [HIVE-7055] - config not propagating for PTFOperator
  • [HIVE-7057] - webhcat e2e deployment scripts don't have x bit set
  • [HIVE-7060] - Column stats give incorrect min and distinct_count
  • [HIVE-7061] - sql std auth - insert queries without overwrite should not require delete privileges
  • [HIVE-7062] - Support Streaming mode in Windowing
  • [HIVE-7063] - Optimize for the Top N within a Group use case
  • [HIVE-7065] - Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
  • [HIVE-7066] - hive-exec jar is missing avro core
  • [HIVE-7067] - Min() and Max() on Timestamp and Date columns for ORC returns wrong results
  • [HIVE-7071] - Use custom Tez split generator to support schema evolution
  • [HIVE-7072] - HCatLoader only loads first region of hbase table
  • [HIVE-7075] - JsonSerde raises NullPointerException when object key is not lower case
  • [HIVE-7076] - Plugin (exec hook) to log to application timeline data to Yarn
  • [HIVE-7077] - Hive contrib compilation may be broken with removal of org.apache.hadoop.record
  • [HIVE-7079] - Hive logs errors about missing tables when parsing CTE expressions
  • [HIVE-7080] - In PTest framework, Add logs URL to the JIRA comment
  • [HIVE-7082] - Vectorized parquet reader should create assigners only for the columns it assigns, not for scratch columns
  • [HIVE-7083] - Fix test failures on trunk
  • [HIVE-7087] - Remove lineage information after query completion
  • [HIVE-7092] - Insert overwrite should not delete the original directory
  • [HIVE-7096] - Support grouped splits in Tez partitioned broadcast join
  • [HIVE-7099] - Add Decimal datatype support for Windowing
  • [HIVE-7104] - Unit tests are disabled
  • [HIVE-7105] - Enable ReduceRecordProcessor to generate VectorizedRowBatches
  • [HIVE-7107] - Fix HiveServer1 JDBC Driver spec compliancy issue
  • [HIVE-7109] - Resource leak in HBaseStorageHandler
  • [HIVE-7112] - Tez processor swallows errors
  • [HIVE-7114] - Extra Tez session is started during HiveServer2 startup
  • [HIVE-7116] - HDFS FileSystem object cache causes permission issues in creating tmp directories
  • [HIVE-7117] - Partitions not inheriting table permissions after alter rename partition
  • [HIVE-7118] - Oracle upgrade schema scripts do not map Java long datatype columns correctly for transaction related tables
  • [HIVE-7119] - Extended ACLs should be inherited if warehouse perm inheritance enabled
  • [HIVE-7123] - Follow-up of HIVE-6367
  • [HIVE-7130] - schematool is broken for minor version upgrades (eg 0.13.x)
  • [HIVE-7131] - Dependencies of fetch task for tez are not shown properly
  • [HIVE-7135] - Fix test fail of TestTezTask.testSubmit
  • [HIVE-7143] - Add Streaming support in Windowing mode for more UDAFs (min/max, lead/lag, fval/lval)
  • [HIVE-7144] - GC pressure during ORC StringDictionary writes
  • [HIVE-7146] - posexplode() UDTF fails with a NullPointerException on NULL columns
  • [HIVE-7147] - ORC PPD should handle CHAR/VARCHAR types
  • [HIVE-7149] - Parquet not able to handle negative decimal numbers
  • [HIVE-7154] - TestMetrics fails intermittently on the trunk
  • [HIVE-7155] - WebHCat controller job exceeds container memory limit
  • [HIVE-7159] - For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition
  • [HIVE-7161] - TestMetastoreVersion fails intermittently on trunk
  • [HIVE-7162] - hadoop-1 build broken by HIVE-7071
  • [HIVE-7165] - Fix hive-default.xml.template errors & omissions
  • [HIVE-7167] - Hive Metastore fails to start with SQLServerException
  • [HIVE-7169] - HiveServer2 in Http Mode should have a configurable IdleMaxTime timeout
  • [HIVE-7170] - Fix display_colstats_tbllvl.q in trunk
  • [HIVE-7173] - Support HIVE-4867 on mapjoin of MR Tasks
  • [HIVE-7174] - Do not accept string as scale and precision when reading Avro schema
  • [HIVE-7176] - FileInputStream is not closed in Commands#properties()
  • [HIVE-7182] - ResultSet is not closed in JDBCStatsPublisher#init()
  • [HIVE-7183] - Size of partColumnGrants should be checked in ObjectStore#removeRole()
  • [HIVE-7187] - Reconcile jetty versions in hive
  • [HIVE-7188] - sum(if()) returns wrong results with vectorization
  • [HIVE-7190] - WebHCat launcher task failure can cause two concurrent user jobs to run
  • [HIVE-7191] - optimized map join hash table has a bug when it reaches 2Gb
  • [HIVE-7192] - Hive Streaming - Some required settings are not mentioned in the documentation
  • [HIVE-7199] - Cannot alter table to parquet
  • [HIVE-7200] - Beeline output displays column heading even if --showHeader=false is set
  • [HIVE-7201] - Fix TestHiveConf#testConfProperties test case
  • [HIVE-7202] - DbTxnManager deadlocks in hcatalog.cli.TestSemanticAnalysis.testAlterTblFFpart()
  • [HIVE-7209] - allow metastore authorization api calls to be restricted to certain invokers
  • [HIVE-7210] - NPE with "No plan file found" when running Driver instances on multiple threads
  • [HIVE-7213] - COUNT(*) returns out-dated count value after TRUNCATE
  • [HIVE-7220] - Empty dir in external table causes issue (root_dir_external_table.q failure)
  • [HIVE-7225] - Unclosed Statements in TxnHandler
  • [HIVE-7226] - Windowing Streaming mode causes NPE for empty partitions
  • [HIVE-7228] - StreamPrinter should be joined to calling thread
  • [HIVE-7229] - String is compared using equal in HiveMetaStore#HMSHandler#init()
  • [HIVE-7232] - VectorReduceSink is emitting incorrect JOIN keys
  • [HIVE-7234] - Select on decimal column throws NPE
  • [HIVE-7235] - TABLESAMPLE on join table is regarded as alias
  • [HIVE-7236] - Tez progress monitor should indicate running/failed tasks
  • [HIVE-7237] - hive.exec.parallel=true w/ Hive 0.13/Tez causes application to linger forever
  • [HIVE-7241] - Wrong lock acquired for alter table rename partition
  • [HIVE-7242] - alter table drop partition is acquiring the wrong type of lock
  • [HIVE-7245] - Fix parquet_columnar
  • [HIVE-7246] - Hive transaction manager hardwires bonecp as the JDBC pooling implementation
  • [HIVE-7247] - Fix itests using hadoop-1 profile
  • [HIVE-7249] - HiveTxnManager.closeTxnManager() throws if called after commitTxn()
  • [HIVE-7251] - Fix StorageDescriptor usage in unit tests
  • [HIVE-7257] - UDF format_number() does not work on FLOAT types
  • [HIVE-7263] - Missing fixes from review of parquet-timestamp
  • [HIVE-7265] - BINARY columns use BytesWritable::getBytes() without ::getLength()
  • [HIVE-7268] - On Windows Hive jobs in Webhcat always run on default MR mode
  • [HIVE-7271] - Speed up unit tests
  • [HIVE-7274] - Update PTest2 to JClouds 1.7.3
  • [HIVE-7279] - UDF format_number() does not work on DECIMAL types
  • [HIVE-7281] - DbTxnManager acquiring wrong level of lock for dynamic partitioning
  • [HIVE-7282] - HCatLoader fails to load Orc map with null key
  • [HIVE-7287] - hive --rcfilecat command is broken on Windows
  • [HIVE-7294] - sql std auth - authorize show grant statements
  • [HIVE-7298] - desc database extended does not show properties of the database
  • [HIVE-7302] - Allow Auto-reducer parallelism to be turned off by a logical optimizer
  • [HIVE-7303] - IllegalMonitorStateException when stmtHandle is null in HiveStatement
  • [HIVE-7304] - Transitive Predicate Propagation doesn't happen properly after HIVE-7159
  • [HIVE-7314] - Wrong results of UDF when hive.cache.expr.evaluation is set
  • [HIVE-7317] - authorization_explain.q fails when run in sequence
  • [HIVE-7323] - Date type stats in ORC sometimes go stale
  • [HIVE-7325] - Support non-constant expressions for ARRAY/MAP type indices.
  • [HIVE-7326] - Hive complains invalid column reference with 'having' aggregate predicates
  • [HIVE-7339] - hive --orcfiledump command is not supported on Windows
  • [HIVE-7342] - support hiveserver2,metastore specific config files
  • [HIVE-7344] - Add streaming support in Windowing mode for FirstVal, LastVal
  • [HIVE-7345] - Beeline changes its prompt to reflect successful database connection even after failing to connect
  • [HIVE-7346] - Wrong results caused by hive ppd under specific join condition
  • [HIVE-7352] - Queries without tables fail under Tez
  • [HIVE-7353] - HiveServer2 using embedded MetaStore leaks JDOPersistenceManager
  • [HIVE-7354] - windows:Need to set hbase jars in hadoop classpath explicitly
  • [HIVE-7356] - Table level stats collection fails for partitioned tables
  • [HIVE-7359] - Stats based compute query replies fail to do simple column transforms
  • [HIVE-7363] - VectorExpressionWriterDecimal is missing null check in setValue()
  • [HIVE-7366] - getDatabase using direct sql
  • [HIVE-7373] - Hive should not remove trailing zeros for decimal numbers
  • [HIVE-7374] - SHOW COMPACTIONS fails with remote metastore when there are no compactions
  • [HIVE-7376] - add minimizeJar to jdbc/pom.xml
  • [HIVE-7385] - Optimize for empty relation scans
  • [HIVE-7389] - Reduce number of metastore calls in MoveTask (when loading dynamic partitions)
  • [HIVE-7393] - Tez jobs sometimes fail with NPE processing input splits
  • [HIVE-7394] - ORC writer logging fails when the padding is < 0.01
  • [HIVE-7396] - BucketingSortingReduceSinkOptimizer throws NullPointerException during ETL
  • [HIVE-7397] - Set the default threshold for fetch task conversion to 1Gb
  • [HIVE-7399] - Timestamp type is not copied by ObjectInspectorUtils.copyToStandardObject
  • [HIVE-7403] - stats are not updated correctly after doing insert into table
  • [HIVE-7409] - Add workaround for a deadlock issue of Class.getAnnotation()
  • [HIVE-7412] - column stats collection throws exception if all values for a column are null
  • [HIVE-7414] - Update golden file for MiniTez temp_table.q
  • [HIVE-7415] - Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx failing
  • [HIVE-7417] - select count(1) from ... where true; fails in optimizer
  • [HIVE-7419] - Missing break in SemanticAnalyzer#getTableDescFromSerDe()
  • [HIVE-7421] - Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
  • [HIVE-7422] - Array out of bounds exception involving ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble
  • [HIVE-7423] - produce hive-exec-core.jar from ql module
  • [HIVE-7424] - HiveException: Error evaluating concat(concat(' ', str2), ' ') in ql.exec.vector.VectorSelectOperator.processOp
  • [HIVE-7426] - ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
  • [HIVE-7429] - Set replication for archive called before file exists
  • [HIVE-7433] - ColumnMappings.ColumnMapping should expose public accessors for its fields
  • [HIVE-7441] - Custom partition scheme gets rewritten with hive scheme upon concatenate
  • [HIVE-7450] - Database should inherit perms of warehouse dir
  • [HIVE-7451] - pass function name in create/drop function to authorization api
  • [HIVE-7452] - Boolean comparison is done through reference equality rather than using equals
  • [HIVE-7459] - Fix NPE when an empty file is included in a Hive query that uses CombineHiveInputFormat
  • [HIVE-7470] - Wrong Thrift declaration for ShowCompactResponseElement
  • [HIVE-7472] - CLONE - Import fails for tables created with default text, sequence and orc file formats using HCatalog API
  • [HIVE-7473] - Null values in DECIMAL columns cause serialization issues with HCatalog
  • [HIVE-7475] - Beeline requires newline at the end of each query in a file
  • [HIVE-7481] - The planning side changes for SMB join on hive-tez
  • [HIVE-7482] - The execution side changes for SMB join in hive-tez
  • [HIVE-7486] - Delete jar should close current classloader
  • [HIVE-7488] - pass column names being used for inputs to authorization api
  • [HIVE-7490] - Revert ORC stripe size
  • [HIVE-7494] - ORC returns empty rows for constant folded date queries
  • [HIVE-7508] - Kerberos support for streaming
  • [HIVE-7514] - Vectorization does not handle constant expression whose value is NULL
  • [HIVE-7521] - Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
  • [HIVE-7522] - Update .q.out for cluster_tasklog_retrieval.q test
  • [HIVE-7529] - load data query fails on hdfs federation + viewfs
  • [HIVE-7531] - auxpath parameter does not handle paths relative to current working directory.
  • [HIVE-7533] - sql std auth - set authorization privileges for tables when created from hive cli
  • [HIVE-7538] - Fix eclipse:eclipse after HIVE-7496
  • [HIVE-7539] - streaming windowing UDAF seems to be broken without Partition Spec
  • [HIVE-7553] - avoid the scheduling maintenance window for every jar change
  • [HIVE-7557] - When reduce is vectorized, dynpart_sort_opt_vectorization.q under Tez fails
  • [HIVE-7558] - HCatLoader reuses credentials across jobs
  • [HIVE-7563] - ClassLoader should be released from LogFactory
  • [HIVE-7574] - CommonJoinOperator.checkAndGenObject calls LOG.Trace per row from probe side in a HashMap join consuming 4% of the CPU
  • [HIVE-7576] - Add PartitionSpec support in HCatClient API
  • [HIVE-7579] - error message for 'drop admin role' in sql std auth mode is not informative
  • [HIVE-7583] - Use FileSystem.access() if available to check file access for user
  • [HIVE-7592] - List Jars or Files are not supported by Beeline
  • [HIVE-7595] - isKerberosMode() does a case sensitive comparison
  • [HIVE-7599] - NPE in MergeTask#main() when -format is absent
  • [HIVE-7600] - ConstantPropagateProcFactory uses reference equality on Boolean
  • [HIVE-7618] - TestDDLWithRemoteMetastoreSecondNamenode unit test failure
  • [HIVE-7620] - Hive metastore fails to start in secure mode due to "java.lang.NoSuchFieldError: SASL_PROPS" error
  • [HIVE-7623] - hive partition rename fails if filesystem cache is disabled
  • [HIVE-7629] - Problem in SMB Joins between two Parquet tables
  • [HIVE-7634] - Use Configuration.getPassword() if available to eliminate passwords from hive-site.xml
  • [HIVE-7635] - Query having same aggregate functions but different case throws IndexOutOfBoundsException
  • [HIVE-7637] - Change throws clause for Hadoop23Shims.ProxyFileSystem23.access()
  • [HIVE-7638] - Disallow CREATE VIEW when created with a temporary table
  • [HIVE-7645] - Hive CompactorMR job set NUM_BUCKETS mistake
  • [HIVE-7647] - Beeline does not honor --headerInterval and --color when executing with "-e"
  • [HIVE-7648] - authorization check api should provide table for create table,drop/create index, and db for create/switch db
  • [HIVE-7649] - Support column stats with temporary tables
  • [HIVE-7658] - Hive search order for hive-site.xml when using --config option
  • [HIVE-7666] - Join selectivity calculation should use exponential back-off for conjunction predicates
  • [HIVE-7667] - handle cast for long in get_aggr_stats() api for metastore for mysql
  • [HIVE-7669] - parallel order by clause on a string column fails with IOException: Split points are out of order
  • [HIVE-7673] - Authorization api: missing privilege objects in create table/view
  • [HIVE-7676] - JDBC: Support more DatabaseMetaData, ResultSetMetaData methods
  • [HIVE-7678] - add more test cases for tables qualified with database/schema name
  • [HIVE-7680] - Do not throw SQLException for HiveStatement getMoreResults and setEscapeProcessing(false)
  • [HIVE-7681] - qualified tablenames usage does not work with several alter-table commands
  • [HIVE-7682] - HadoopThriftAuthBridge20S should not reset configuration unless required
  • [HIVE-7683] - Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
  • [HIVE-7694] - SMB join on tables differing by number of sorted by columns with same join prefix fails
  • [HIVE-7695] - hive stats issue when insert query is appending data into table
  • [HIVE-7700] - authorization api - HivePrivilegeObject for permanent function should have database name set
  • [HIVE-7701] - Upgrading tez to 0.4.1 causes metadata only query to fail.
  • [HIVE-7704] - Create tez task for fast file merging
  • [HIVE-7710] - Rename table across database might fail
  • [HIVE-7712] - hive-exec-0.13.0.2.1.2.0-402.jar contains avro classes compiled against hadoop-v1
  • [HIVE-7722] - TestJdbcDriver2.testDatabaseMetaData fails after HIVE-7676
  • [HIVE-7730] - Extend ReadEntity to add accessed columns from query
  • [HIVE-7733] - Ambiguous column reference error on query
  • [HIVE-7738] - tez select sum(decimal) from union all of decimal and null throws NPE
  • [HIVE-7744] - In Windowing Streaming mode Avg and Sum give incorrect results when Wdw size is same as partition size
  • [HIVE-7753] - Same operand appears on both sides of > in DataType#compareByteArray()
  • [HIVE-7760] - Constants in VirtualColumn should be final
  • [HIVE-7764] - Support all JDBC-HiveServer2 authentication modes on a secure cluster
  • [HIVE-7769] - add --SORT_BEFORE_DIFF to union all .q tests
  • [HIVE-7770] - Undo backward-incompatible behaviour change introduced by HIVE-7341
  • [HIVE-7771] - ORC PPD fails for some decimal predicates
  • [HIVE-7774] - Issues with location path for temporary external tables
  • [HIVE-7777] - Add CSV Serde based on OpenCSV
  • [HIVE-7784] - Created the needed indexes on Hive.PART_COL_STATS for CBO
  • [HIVE-7786] - add --SORT_BEFORE_DIFF to union all tez .q.out files
  • [HIVE-7800] - Parquet Column Index Access Schema Size Checking
  • [HIVE-7807] - Refer to umask property using FsPermission.UMASK_LABEL.
  • [HIVE-7812] - Disable CombineHiveInputFormat when ACID format is used
  • [HIVE-7813] - Hive join key not null shouldn't be generated for partition column
  • [HIVE-7823] - HIVE-6185 removed Partition.getPartition
  • [HIVE-7824] - CLIServer.getOperationStatus eats ExecutionException
  • [HIVE-7828] - TestCLIDriver.parquet_join.q is failing on trunk
  • [HIVE-7829] - Entity.getLocation can throw an NPE
  • [HIVE-7834] - Use min, max and NDV from the stats to better estimate many to many vs one to many inner joins
  • [HIVE-7840] - Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s
  • [HIVE-7841] - Case, When, Lead, Lag UDF is missing annotation
  • [HIVE-7846] - authorization api should support group, not assume case insensitive role names
  • [HIVE-7851] - Fix NPE in split generation on Tez 0.5
  • [HIVE-7857] - Hive query fails after Tez session times out
  • [HIVE-7859] - Tune zlib compression in ORC to account for the encoding strategy
  • [HIVE-7863] - Potential null reference in TxnDbUtil#prepareDb()
  • [HIVE-7865] - Extend TestFileDump test case to printout ORC row index information
  • [HIVE-7878] - add -- SORT_BEFORE_DIFF to optimize_nullscan.q test
  • [HIVE-7883] - DBTxnManager trying to close already closed metastore client connection
  • [HIVE-7887] - VectorFileSinkOp does not publish the stats correctly
  • [HIVE-7889] - Query fails with char partition column
  • [HIVE-7890] - SessionState creates HMS Client while not impersonating
  • [HIVE-7891] - Table-creation fails through HCatClient for Oracle-based metastore.
  • [HIVE-7892] - Thrift Set type not working with Hive
  • [HIVE-7895] - Storage based authorization should consider sticky bit for drop actions
  • [HIVE-7897] - ObjectStore not using getPassword() for JDO connection string
  • [HIVE-7899] - txnMgr should be session specific
  • [HIVE-7901] - CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
  • [HIVE-7902] - Cleanup hbase-handler/pom.xml dependency list
  • [HIVE-7911] - Guaranteed ClassCastException in AccumuloRangeGenerator
  • [HIVE-7913] - Simplify filter predicates for CBO
  • [HIVE-7914] - Simplify join predicates for CBO to avoid cross products
  • [HIVE-7915] - Expose High and Low value in plan.ColStatistics
  • [HIVE-7919] - sql std auth: user with 'admin option' for role should be able to list all users in the role
  • [HIVE-7927] - Checking sticky bit needs shim
  • [HIVE-7936] - Support for handling Thrift Union types
  • [HIVE-7943] - hive.security.authorization.createtable.owner.grants is ineffective with Default Authorization
  • [HIVE-7944] - current update stats for columns of a partition of a table is not correct
  • [HIVE-7946] - CBO: Merge CBO changes to Trunk
  • [HIVE-7949] - Create table LIKE command doesn't set new owner
  • [HIVE-7950] - StorageHandler resources aren't added to Tez Session if Session is already Open
  • [HIVE-7957] - Revisit event version handling in dynamic partition pruning on Tez
  • [HIVE-7971] - Support alter table change/replace/add columns for existing partitions
  • [HIVE-7972] - hiveserver2 specific configuration file is not getting used
  • [HIVE-7976] - Merge tez branch into trunk (tez 0.5.0)
  • [HIVE-7982] - Regression in explain with CBO enabled due to issuing query per K,V for the stats
  • [HIVE-7984] - AccumuloOutputFormat Configuration items from StorageHandler not re-set in Configuration in Tez
  • [HIVE-7985] - With CBO enabled cross product is generated when a subquery is present
  • [HIVE-7987] - Storage based authorization - NPE for drop view
  • [HIVE-7993] - With CBO enabled Q75 fails with RuntimeException: cannot find field _col69 from [0:_col18,...]
  • [HIVE-8008] - NPE while reading null decimal value
  • [HIVE-8012] - TestHiveServer2Concurrency is not implemented
  • [HIVE-8018] - Fix typo in config var name for dynamic partition pruning
  • [HIVE-8019] - Missing hive 0.13.1 commit in trunk : export/import statement authorization - CVE-2014-0228
  • [HIVE-8022] - Recursive root scratch directory creation is not using hdfs umask properly
  • [HIVE-8023] - Code in HIVE-6380 eats exceptions
  • [HIVE-8034] - Don't add colon when no port is specified
  • [HIVE-8041] - Hadoop-2 build is broken with JDK6
  • [HIVE-8044] - Container size and hash table size should be taken into account before deciding to do a MapJoin
  • [HIVE-8045] - SQL standard auth with cli - Errors and configuration issues
  • [HIVE-8047] - Lazy char/varchar are not using escape char defined in serde params
  • [HIVE-8051] - Some union queries fail with dynamic partition pruning on tez
  • [HIVE-8052] - Vectorization: min() on TimeStamp datatype fails with error "Vector aggregate not implemented: min for type: TIMESTAMP"
  • [HIVE-8056] - SessionState.dropSessionPaths should use FileSystem.getLocal(conf) to delete local files
  • [HIVE-8062] - Stats collection for columns fails on a partitioned table with null values in partitioning column
  • [HIVE-8071] - hive shell tries to write hive-exec.jar for each run
  • [HIVE-8078] - ORC Delta encoding corrupts data when delta overflows long
  • [HIVE-8081] - "drop index if exists" fails if table specified does not exist
  • [HIVE-8082] - generateErrorMessage doesn't handle null ast properly
  • [HIVE-8083] - Authorization DDLs should not enforce hive identifier syntax for user or group
  • [HIVE-8085] - stats optimizer should not use Description annotation to figure out function mapping (because FunctionRegistry doesn't)
  • [HIVE-8090] - Potential null pointer reference in WriterImpl#StreamFactory#createStream()
  • [HIVE-8092] - Vectorized Tez count(*) returns NULL instead of 0 when result is empty
  • [HIVE-8095] - Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
  • [HIVE-8102] - Partitions of type 'date' behave incorrectly with daylight saving time.
  • [HIVE-8103] - Read ACID tables with FetchOperator returns no rows
  • [HIVE-8104] - Insert statements against ACID tables NPE when vectorization is on
  • [HIVE-8105] - booleans and nulls not handled properly in insert/values
  • [HIVE-8107] - Bad error message for non-existent table in update and delete
  • [HIVE-8112] - Change reporting string to reflect update in Tez
  • [HIVE-8114] - Type resolution for udf arguments of Decimal Type results in error
  • [HIVE-8115] - Hive select query hang when fields contain map
  • [HIVE-8126] - Standalone hive-jdbc jar is not packaged in the Hive distribution
  • [HIVE-8138] - Global Init file should allow specifying file name not only directory
  • [HIVE-8139] - Upgrade commons-lang from 2.4 to 2.6
  • [HIVE-8142] - Add merge operators to queryplan.thrift instead of generated source file
  • [HIVE-8143] - Create root scratch dir with 733 instead of 777 perms
  • [HIVE-8146] - Test TestTempletonUtils.testFindContainingJar failing
  • [HIVE-8148] - HDFS Path named with file:// instead of file:/// results in Unit test failures in Windows
  • [HIVE-8149] - hive.optimize.reducededuplication should be set to false for IUD ops
  • [HIVE-8151] - Dynamic partition sort optimization inserts record wrongly to partition when used with GroupBy
  • [HIVE-8152] - Update with expression in set fails
  • [HIVE-8153] - Reduce the verbosity of debug logs in ORC record reader
  • [HIVE-8154] - HadoopThriftAuthBridge20S.getHadoopSaslProperties is incompatible with Hadoop 2.4.1 and later
  • [HIVE-8156] - Vectorized reducers need to avoid memory build-up during a single key
  • [HIVE-8158] - Optimize writeValue/setValue in VectorExpressionWriterFactory (in VectorReduceSinkOperator codepath)
  • [HIVE-8162] - Dynamic sort optimization propagates additional columns even in the absence of order by
  • [HIVE-8167] - mvn install command broken by HIVE-8126 commit
  • [HIVE-8169] - Windows: alter table ..set location from hcatalog failed with NullPointerException
  • [HIVE-8170] - Hive Metastore schema script missing for mssql for v0.14.0
  • [HIVE-8171] - Tez and Vectorized Reduce doesn't create scratch columns
  • [HIVE-8175] - Hive metastore upgrade from v0.13.0 to v0.14.0 script for Oracle is missing an upgrade step
  • [HIVE-8178] - OrcNewInputFormat::getSplits() calls OrcInputFormat.generateSplitsInfo twice
  • [HIVE-8179] - Fetch task conversion: Remove some dependencies on AST
  • [HIVE-8184] - Inconsistency between colList and columnExprMap when ConstantPropagate is applied to subquery
  • [HIVE-8185] - hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build
  • [HIVE-8188] - ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop
  • [HIVE-8189] - A select statement with a subquery is failing with HBaseSerde
  • [HIVE-8191] - Update and delete on tables with non Acid output formats gives runtime error
  • [HIVE-8196] - Joining on partition columns with fetch column stats enabled results in very small CE which negatively affects query performance
  • [HIVE-8200] - Make beeline use the hive-jdbc standalone jar
  • [HIVE-8201] - Remove hardwiring to HiveInputFormat in acid qfile tests
  • [HIVE-8203] - ACID operations result in NPE when run through HS2
  • [HIVE-8205] - Using strings in group type fails in ParquetSerDe
  • [HIVE-8210] - TezJobMonitor should print time spent in Application (RUNNING)
  • [HIVE-8212] - Regression for hcat commandline alter view set tblproperties
  • [HIVE-8217] - WebHCat 'jobs' endpoint fails if it runs into issues with any of the jobs
  • [HIVE-8221] - authorize additional metadata read operations in metastore storage based authorization
  • [HIVE-8225] - CBO trunk merge: union11 test fails due to incorrect plan
  • [HIVE-8226] - Vectorize dynamic partitioning in VectorFileSinkOperator
  • [HIVE-8227] - NPE w/ hive on tez when doing unions on empty tables
  • [HIVE-8229] - Add multithreaded tests for the Hive Writable data types
  • [HIVE-8231] - Error when insert into empty table with ACID
  • [HIVE-8235] - Insert into partitioned bucketed sorted tables fails with "this file is already being created by"
  • [HIVE-8236] - VectorHashKeyWrapper allocates too many zero sized arrays
  • [HIVE-8239] - MSSQL upgrade schema scripts do not map Java long datatype columns correctly for transaction related tables
  • [HIVE-8240] - VectorColumnAssignFactory throws "Incompatible Bytes vector column and primitive category VARCHAR"
  • [HIVE-8246] - HiveServer2 in http-kerberos mode is restrictive on client usernames
  • [HIVE-8248] - TestHCatLoader.testReadDataPrimitiveTypes() occasionally fails
  • [HIVE-8250] - Truncating table doesn't invalidate stats
  • [HIVE-8257] - Accumulo introduces old hadoop-client dependency
  • [HIVE-8258] - Compactor cleaners can be starved on a busy table or partition.
  • [HIVE-8260] - CBO : Query has date_dim d1,date_dim d2 and date_dim d3 but the explain has d1, d1 and d1
  • [HIVE-8261] - CBO : Predicate pushdown is removed by Optiq
  • [HIVE-8263] - CBO : In TPC-DS Q64, item is joined last with store_sales while it should be first as it is the most selective
  • [HIVE-8269] - Revert HIVE-8200 (Make beeline use the hive-jdbc standalone jar)
  • [HIVE-8270] - JDBC uber jar is missing some classes required in secure setup.
  • [HIVE-8271] - Jackson incompatibility between hadoop-2.4 and hive-14
  • [HIVE-8272] - Query with particular decimal expression causes NPE during execution initialization
  • [HIVE-8273] - Beeline doesn't print applicationID for submitted DAG
  • [HIVE-8277] - IP address string in HS2, metastore have a "/" prefix
  • [HIVE-8279] - sql std auth - additional test cases
  • [HIVE-8280] - CBO : When filter is applied on dimension table PK/FK code path is not in effect.
  • [HIVE-8281] - NPE with dynamic partition pruning on Tez
  • [HIVE-8283] - Missing break in FilterSelectivityEstimator#visitCall()
  • [HIVE-8287] - Metadata action errors don't have information about cause
  • [HIVE-8290] - With DbTxnManager configured, all ORC tables forced to be transactional
  • [HIVE-8292] - Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp
  • [HIVE-8296] - Tez ReduceShuffle Vectorization needs 2 data buffers (key and value) for adding rows
  • [HIVE-8298] - Incorrect results for n-way join when join expressions are not in same order across joins
  • [HIVE-8299] - HiveServer2 in http-kerberos & doAs=true is failing with org.apache.hadoop.security.AccessControlException
  • [HIVE-8304] - Tez Reduce-Side GROUP BY Vectorization doesn't copy NULL keys correctly
  • [HIVE-8310] - RetryingHMSHandler is not used when kerberos auth enabled
  • [HIVE-8311] - Driver is encoding transaction information too late
  • [HIVE-8313] - Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
  • [HIVE-8314] - Restore thrift string interning of HIVE-7975
  • [HIVE-8315] - CBO : Negate condition underestimates selectivity which results in an inefficient plan
  • [HIVE-8316] - CBO : cardinality estimation for filters is much lower than actual row count
  • [HIVE-8318] - Null Scan optimizer throws exception when no partitions are selected
  • [HIVE-8321] - Fix serialization of TypeInfo for qualified types
  • [HIVE-8322] - VectorReduceSinkOperator: ClassCastException: ~StandardUnionObjectInspector$StandardUnion cannot be cast to ~IntWritable
  • [HIVE-8324] - Shim KerberosName (causes build failure on hadoop-1)
  • [HIVE-8328] - MapJoin implementation in Tez should not reload hashtables
  • [HIVE-8332] - Reading an ACID table with vectorization on results in NPE
  • [HIVE-8335] - TestHCatLoader/TestHCatStorer failures on pre-commit tests
  • [HIVE-8336] - Update pom, now that Optiq is renamed to Calcite
  • [HIVE-8340] - Windows: HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.
  • [HIVE-8341] - Transaction information in config file can grow excessively large
  • [HIVE-8344] - Hive on Tez sets mapreduce.framework.name to yarn-tez
  • [HIVE-8348] - Fix Hive to match changes introduced by TEZ-1510
  • [HIVE-8349] - DISTRIBUTE BY should work with tez auto-parallelism enabled
  • [HIVE-8354] - HIVE-7156 introduced required dependency on tez
  • [HIVE-8361] - NPE in PTFOperator when there are empty partitions
  • [HIVE-8363] - AccumuloStorageHandler compile failure hadoop-1
  • [HIVE-8364] - We're not waiting for all inputs in MapRecordProcessor on Tez
  • [HIVE-8366] - CBO fails if there is a table sample in subquery
  • [HIVE-8367] - delete writes records in wrong order in some cases
  • [HIVE-8368] - compactor is improperly writing delete records in base file
  • [HIVE-8372] - Potential NPE in Tez MergeFileRecordProcessor
  • [HIVE-8378] - NPE in TezTask due to null counters
  • [HIVE-8380] - NanoTime class serializes and deserializes Timestamp incorrectly
  • [HIVE-8382] - ConstantPropagateProcFactory#isDeterministicUdf adds a lot of ERROR level logs
  • [HIVE-8386] - HCAT api call is case sensitive on fields in struct column
  • [HIVE-8387] - add retry logic to ZooKeeperStorage in WebHCat
  • [HIVE-8389] - Fix CBO when indexes are used
  • [HIVE-8390] - CBO produces annoying exception message and wraps exceptions too much
  • [HIVE-8391] - Comparison between TIMESTAMP and Integer types goes to STRING as "common comparison denominator" instead of a numeric type
  • [HIVE-8392] - HiveServer2 Operation.close fails on windows
  • [HIVE-8393] - Handle SIGINT on Tez
  • [HIVE-8394] - HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
  • [HIVE-8399] - Build failure on trunk & 14 branch
  • [HIVE-8401] - OrcFileMergeOperator only closes the last ORC file it opened, which resulted in stale data in table directory
  • [HIVE-8402] - Orc pushing SARGs into delta files causing ArrayOutOfBoundsExceptions
  • [HIVE-8403] - Build broken by datanucleus.org being offline
  • [HIVE-8404] - ColumnPruner doesn't prune columns from limit operator
  • [HIVE-8407] - [CBO] Handle filters with non-boolean return type
  • [HIVE-8408] - hcat cli throws NPE when authorizer using new api is enabled
  • [HIVE-8409] - SMB joins fail intermittently on tez
  • [HIVE-8411] - Support partial partition spec for certain ALTER PARTITION statements
  • [HIVE-8413] - [CBO] Handle ill-formed queries which have distinct, having in incorrect context
  • [HIVE-8415] - Vectorized comparison of timestamp and integer needs to treat integer as seconds since epoch
  • [HIVE-8417] - round(decimal, negative) errors out/wrong results with reduce side vectorization
  • [HIVE-8421] - [CBO] Use OptiqSemanticException in error conditions
  • [HIVE-8427] - Hive Streaming : secure streaming hangs leading to time outs.
  • [HIVE-8429] - Add records in/out counters
  • [HIVE-8433] - CBO loses a column during AST conversion
  • [HIVE-8434] - Vectorization logic using wrong values for DATE and TIMESTAMP partitioning columns in vectorized row batches...
  • [HIVE-8442] - Revert HIVE-8403
  • [HIVE-8443] - Disable tez_smb_1 for mapreduce and prevent from test hang
  • [HIVE-8444] - update pom to junit 4.11
  • [HIVE-8445] - TestColumnAccess, TestReadEntityDirect use same table names
  • [HIVE-8452] - Cleanup handling of resource configuration for tez
  • [HIVE-8460] - ORC SARG literal creation for double from float may lead to wrong evaluation of SARG
  • [HIVE-8461] - Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
  • [HIVE-8462] - CBO duplicates columns
  • [HIVE-8464] - Vectorized reducer nested group by query returns wrong results
  • [HIVE-8474] - Vectorized reads of transactional tables fail when not all columns are selected
  • [HIVE-8475] - add test case for use of index from not-current database
  • [HIVE-8476] - JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with Kerberos
  • [HIVE-8478] - Vectorized Reduce-Side Group By doesn't handle Decimal type correctly
  • [HIVE-8479] - Tez sessions cannot change queues once assigned to one within a CLI session
  • [HIVE-8484] - HCatalog throws an exception if Pig job is of type 'fetch'
  • [HIVE-8489] - Add sanity check to dynamic partition pruning
  • [HIVE-8495] - Add progress bar for Hive on Tez queries
  • [HIVE-8497] - StatsNoJobTask doesn't close RecordReader, FSDataInputStream of which keeps open to prevent stale data clean
  • [HIVE-8498] - Insert into table misses some rows when vectorization is enabled
  • [HIVE-8510] - HIVE-8462 didn't update tez test output
  • [HIVE-8511] - fix build failure: cbo_correctness on tez
  • [HIVE-8514] - TestCliDriver.testCliDriver_index_in_db fails in trunk
  • [HIVE-8517] - When joining on partition column NDV gets overridden by StatsUtils.getColStatisticsFromExpression
  • [HIVE-8526] - Hive : CBO incorrect join order in TPC-DS Q45 as self join selectivity has incorrect CE
  • [HIVE-8534] - sql std auth : update configuration whitelist for 0.14
  • [HIVE-8543] - Compactions fail on metastore using postgres
  • [HIVE-8546] - Handle "add archive scripts.tar.gz" in Tez
  • [HIVE-8547] - CBO and/or constant propagation breaks partition_varchar2 test
  • [HIVE-8550] - Hive cannot load data into partitioned table with Unicode key
  • [HIVE-8551] - NPE in FunctionRegistry (affects CBO in negative tests)
  • [HIVE-8555] - Too many casts results in loss of original string representation for constant
  • [HIVE-8557] - automatically setup ZooKeeperTokenStore to use kerberos authentication when kerberos is enabled
  • [HIVE-8558] - CBO: enable n-way joins after CBO join reordering
  • [HIVE-8560] - SerDes that do not inherit AbstractSerDe do not get table properties during initialize()
  • [HIVE-8562] - ResultSet.isClosed sometimes doesn't work with mysql
  • [HIVE-8563] - Running annotate_stats_join_pkfk.q in TestMiniTezCliDriver is causing NPE
  • [HIVE-8566] - Vectorized queries output wrong timestamps
  • [HIVE-8567] - Vectorized queries output extra stuff for Binary columns
  • [HIVE-8575] - CBO: decimal_udf is broken by recent changes (and other tests have type changes)
  • [HIVE-8576] - Guaranteed NPE in StatsRulesProcFactory
  • [HIVE-8577] - Cannot deserialize Avro schema with a map with null values
  • [HIVE-8579] - Guaranteed NPE in DDLSemanticAnalyzer
  • [HIVE-8586] - Record counters aren't updated correctly for vectorized queries
  • [HIVE-8587] - Vectorized Extract operator needs to update the Vectorization Context column map
  • [HIVE-8588] - sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
  • [HIVE-8596] - HiveServer2 dynamic service discovery: ZK throws too many connections error
  • [HIVE-8603] - auto_sortmerge_join_5 is getting stuck on tez
  • [HIVE-8604] - Re-enable auto_sortmerge_join_5 on tez
  • [HIVE-8605] - HIVE-5799 breaks backward compatibility for time values in config
  • [HIVE-8614] - Upgrade hive to use tez version 0.5.2-SNAPSHOT
  • [HIVE-8615] - beeline csv,tsv outputformat needs backward compatibility mode
  • [HIVE-8619] - CBO causes some more type problems
  • [HIVE-8620] - CBO: HIVE-8433 RowResolver check is too stringent
  • [HIVE-8624] - Record counters don't work with Tez container reuse
  • [HIVE-8625] - Some union queries result in plans with many unions with CBO on
  • [HIVE-8628] - NPE in case of shuffle join in tez
  • [HIVE-8629] - Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez
  • [HIVE-8631] - Compressed transaction list cannot be parsed in job.xml
  • [HIVE-8632] - VectorKeyHashWrapper::duplicateTo allocates too many zero sized arrays
  • [HIVE-8634] - HiveServer2 fair scheduler queue mapping doesn't handle the secondary groups rules correctly
  • [HIVE-8635] - CBO: ambiguous_col negative test no longer fails
  • [HIVE-8641] - Disable skew joins in tez.
  • [HIVE-8643] - DDL operations via WebHCat with doAs parameter in secure cluster fail
  • [HIVE-8646] - Hive class loading failure when executing Hive action via oozie workflows
  • [HIVE-8647] - HIVE-8186 causes addition of same child operator multiple times
  • [HIVE-8660] - sql std auth: property missing from whitelist - hive.exec.dynamic.partition.mode
  • [HIVE-8663] - Fetching Vectorization scratch column map in Reduce-Side stop working
  • [HIVE-8664] - Use Apache Curator in JDBC Driver and HiveServer2 for better reliability
  • [HIVE-8665] - Fix misc unit tests on Windows
  • [HIVE-8668] - mssql sql script has carriage returns
  • [HIVE-8671] - Overflow in estimate row count and data size with fetch column stats
  • [HIVE-8675] - Increase thrift server protocol test coverage
  • [HIVE-8677] - TPC-DS Q51 : fails with "init not supported" exception in GenericUDAFStreamingEvaluator.init
  • [HIVE-8685] - DDL operations in WebHCat set proxy user to "null" in unsecure mode
  • [HIVE-8687] - Support Avro through HCatalog

New in Apache Hive 0.12.0 (Feb 5, 2015)

  • Sub-task:
  • [HIVE-2304] - Support PreparedStatement.setObject
  • [HIVE-4055] - add Date data type
  • [HIVE-4266] - Refactor HCatalog code to org.apache.hive.hcatalog
  • [HIVE-4324] - ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
  • [HIVE-4355] - HCatalog test TestPigHCatUtil might fail on JDK7
  • [HIVE-4460] - Publish HCatalog artifacts for Hadoop 2.x
  • [HIVE-4478] - In ORC, add boolean noNulls flag to column stripe metadata
  • [HIVE-4626] - join_vc.q is not deterministic
  • [HIVE-4646] - skewjoin.q is failing in hadoop2
  • [HIVE-4690] - stats_partscan_1.q makes different result with different hadoop.mr.rev
  • [HIVE-4708] - Fix TestCliDriver.combine2.q on 0.23
  • [HIVE-4711] - Fix TestCliDriver.list_bucket_query_oneskew_{1,2,3}.q on 0.23
  • [HIVE-4712] - Fix TestCliDriver.truncate_* on 0.23
  • [HIVE-4713] - Fix TestCliDriver.skewjoin_union_remove_{1,2}.q on 0.23
  • [HIVE-4715] - Fix TestCliDriver.{recursive_dir.q,sample_islocalmode_hook.q,input12.q,input39.q,auto_join14.q} on 0.23
  • [HIVE-4717] - Fix non-deterministic TestCliDriver on 0.23
  • [HIVE-4721] - Fix TestCliDriver.ptf_npath.q on 0.23
  • [HIVE-4746] - Fix TestCliDriver.list_bucket_dml_{2,4,5,9,12,13}.q on 0.23
  • [HIVE-4750] - Fix TestCliDriver.list_bucket_dml_{6,7,8}.q on 0.23
  • [HIVE-4756] - Upgrade Hadoop 0.23 profile to 2.0.5-alpha
  • [HIVE-4761] - ZooKeeperHiveLockManager.unlockPrimitive has race condition with threads
  • [HIVE-4762] - HMS cannot handle concurrent requests
  • [HIVE-4763] - add support for thrift over http transport in HS2
  • [HIVE-4767] - ObjectStore.getPMF has concurrency problems
  • [HIVE-4871] - Apache builds fail with Target "make-pom" does not exist in the project "hcatalog".
  • [HIVE-4894] - Update maven coordinates of HCatalog artifacts
  • [HIVE-4895] - Move all HCatalog classes to org.apache.hive.hcatalog
  • [HIVE-4896] - create binary backwards compatibility layer hcatalog 0.12 and 0.11
  • [HIVE-4908] - rename templeton to webhcat?
  • [HIVE-4940] - udaf_percentile_approx.q is not deterministic
  • [HIVE-4980] - Fix the compiling error in TestHadoop20SAuthBridge
  • [HIVE-5013] - [HCatalog] Create hcat.py, hcat_server.py to make HCatalog work on Windows
  • [HIVE-5014] - [HCatalog] Fix HCatalog build issue on Windows
  • [HIVE-5015] - [HCatalog] Fix HCatalog unit tests on Windows
  • [HIVE-5028] - Some tests fail with OutOfMemoryError PermGen Space on Hadoop2
  • [HIVE-5035] - [WebHCat] Hardening parameters for Windows
  • [HIVE-5036] - [WebHCat] Add cmd script for WebHCat
  • [HIVE-5063] - Fix some non-deterministic or not-updated tests
  • [HIVE-5066] - [WebHCat] Other code fixes for Windows
  • [HIVE-5069] - Tests on list bucketing are failing again in hadoop2
  • [HIVE-5078] - [WebHCat] Fix e2e tests on Windows plus test cases for new features
  • [HIVE-5163] - refactor org.apache.hadoop.mapred.HCatMapRedUtil
  • [HIVE-5213] - remove hcatalog/shims directory
  • [HIVE-5233] - move hbase storage handler to org.apache.hcatalog package
  • [HIVE-5236] - Change HCatalog spacing from 4 spaces to 2
  • [HIVE-5260] - Introduce HivePassThroughOutputFormat that allows Hive to use general purpose OutputFormats instead of HiveOutputFormats in StorageHandlers
  • [HIVE-5261] - Make the Hive HBase storage handler work from HCatalog, and use HiveStorageHandlers instead of HCatStorageHandlers
  • Bug:
  • [HIVE-2015] - Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
  • [HIVE-2379] - Hive/HBase integration could be improved
  • [HIVE-2473] - Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory.
  • [HIVE-2702] - Enhance listPartitionsByFilter to add support for integral types both for equality and non-equality
  • [HIVE-2905] - Desc table can't show non-ascii comments
  • [HIVE-3189] - cast ( as bigint) returning null values
  • [HIVE-3191] - timestamp - timestamp causes null pointer exception
  • [HIVE-3253] - ArrayIndexOutOfBounds exception for deeply nested structs
  • [HIVE-3256] - Update asm version in Hive
  • [HIVE-3264] - Add support for binary datatype to AvroSerde
  • [HIVE-3475] - INLINE UDTF doesn't convert types properly
  • [HIVE-3562] - Some limit can be pushed down to map stage
  • [HIVE-3588] - Get Hive to work with hbase 94
  • [HIVE-3632] - Upgrade datanucleus to support JDK7
  • [HIVE-3691] - TestDynamicSerDe failed with IBM JDK
  • [HIVE-3756] - "LOAD DATA" does not honor permission inheritance
  • [HIVE-3772] - Fix a concurrency bug in LazyBinaryUtils due to a static field
  • [HIVE-3810] - HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile
  • [HIVE-3846] - alter view rename NPEs with authorization on.
  • [HIVE-3891] - physical optimizer changes for auto sort-merge join
  • [HIVE-3926] - PPD on virtual column of partitioned table is not working
  • [HIVE-3953] - Reading of partitioned Avro data fails because of missing properties
  • [HIVE-3957] - Add pseudo-BNF grammar for RCFile to Javadoc
  • [HIVE-3978] - HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
  • [HIVE-4003] - NullPointerException in exec.Utilities
  • [HIVE-4051] - Hive's metastore suffers from 1+N queries when querying partitions & is slow
  • [HIVE-4057] - LazyHBaseRow may return cache data if the field is null and make the result wrong
  • [HIVE-4089] - javax.jdo : jdo2-api dependency not in Maven Central
  • [HIVE-4106] - SMB joins fail in multi-way joins
  • [HIVE-4171] - Current database in metastore.Hive is not consistent with SessionState
  • [HIVE-4181] - Star argument without table alias for UDTF is not working
  • [HIVE-4194] - JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
  • [HIVE-4214] - OVER accepts general expression instead of just function
  • [HIVE-4222] - Timestamp type constants cannot be deserialized in JDK 1.6 or less
  • [HIVE-4233] - The TGT gotten from class 'CLIService' should be renewed on time
  • [HIVE-4251] - Indices can't be built on tables whose schema info comes from SerDe
  • [HIVE-4290] - Build profiles: Partial builds for quicker dev
  • [HIVE-4295] - Lateral view makes invalid result if CP is disabled
  • [HIVE-4299] - exported metadata by HIVE-3068 cannot be imported because of wrong file name
  • [HIVE-4300] - ant thriftif generated code that is checkedin is not up-to-date
  • [HIVE-4322] - SkewedInfo in Metastore Thrift API cannot be deserialized in Python
  • [HIVE-4339] - build fails after branch (hcatalog version not updated)
  • [HIVE-4343] - HS2 with kerberos- local task for map join fails
  • [HIVE-4344] - CREATE VIEW fails when redundant casts are rewritten
  • [HIVE-4347] - Hcatalog build fails on Windows because javadoc command exceeds length limit
  • [HIVE-4348] - Unit test compile fails at hbase-handler project on Windows because of illegal escape character
  • [HIVE-4350] - support AS keyword for table alias
  • [HIVE-4351] - Thrift code generation fails due to hcatalog
  • [HIVE-4364] - beeline always exits with 0 status, should exit with non-zero status on error
  • [HIVE-4369] - Many new failures on hadoop 2
  • [HIVE-4375] - Single sourced multi insert consists of native and non-native table mixed throws NPE
  • [HIVE-4377] - Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
  • [HIVE-4392] - Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns
  • [HIVE-4403] - Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
  • [HIVE-4406] - Missing "/" or "/" in hs2 jdbc uri switches mode to embedded mode
  • [HIVE-4407] - TestHCatStorer.testStoreFuncAllSimpleTypes fails because of null case difference
  • [HIVE-4418] - TestNegativeCliDriver failure message if cmd succeeds is misleading
  • [HIVE-4421] - Improve memory usage by ORC dictionaries
  • [HIVE-4422] - Test output need to be updated for Windows only unit test in TestCliDriver
  • [HIVE-4424] - MetaStoreUtils.java.orig checked in mistakenly by HIVE-4409
  • [HIVE-4428] - Misspelling in describe extended output
  • [HIVE-4430] - Semantic analysis fails in presence of certain literals in on clause
  • [HIVE-4433] - Fix C++ Thrift bindings broken in HIVE-4322
  • [HIVE-4435] - Column stats: Distinct value estimator should use hash functions that are pairwise independent
  • [HIVE-4436] - hive.exec.parallel=true doesn't work on hadoop-2
  • [HIVE-4438] - Remove unused join configuration parameter: hive.mapjoin.size.key
  • [HIVE-4439] - Remove unused join configuration parameter: hive.mapjoin.cache.numrows
  • [HIVE-4440] - SMB Operator spills to disk like it's 1999
  • [HIVE-4441] - [HCatalog] WebHCat does not honor user home directory
  • [HIVE-4442] - [HCatalog] WebHCat should not override user.name parameter for Queue call
  • [HIVE-4465] - webhcat e2e tests succeed regardless of exitvalue
  • [HIVE-4466] - Fix continue.on.failure in unit tests to -well- continue on failure in unit tests
  • [HIVE-4471] - Build fails with hcatalog checkstyle error
  • [HIVE-4474] - Column access not tracked properly for partitioned tables
  • [HIVE-4475] - Switch RCFile default to LazyBinaryColumnarSerDe
  • [HIVE-4486] - FetchOperator slows down SMB map joins by 50% when there are many partitions
  • [HIVE-4487] - Hive does not set explicit permissions on hive.exec.scratchdir
  • [HIVE-4489] - beeline always returns the same error message twice
  • [HIVE-4492] - Revert HIVE-4322
  • [HIVE-4496] - JDBC2 won't compile with JDK7
  • [HIVE-4497] - beeline module tests don't get run by default
  • [HIVE-4502] - NPE - subquery smb joins fails
  • [HIVE-4510] - HS2 doesn't nest exceptions properly (fun debug times)
  • [HIVE-4513] - disable hivehistory logs by default
  • [HIVE-4516] - Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
  • [HIVE-4521] - Auto join conversion fails in certain cases (empty tables, empty partitions, no partitions)
  • [HIVE-4525] - Support timestamps earlier than 1970 and later than 2038
  • [HIVE-4535] - hive build fails with hadoop 0.20
  • [HIVE-4540] - JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true
  • [HIVE-4542] - TestJdbcDriver2.testMetaDataGetSchemas fails because of unexpected database
  • [HIVE-4543] - Broken link in HCat 0.5 doc (Reader and Writer Interfaces)
  • [HIVE-4546] - Hive CLI leaves behind the per session resource directory on non-interactive invocation
  • [HIVE-4547] - A complex create view statement fails with new Antlr 3.4
  • [HIVE-4550] - local_mapred_error_cache fails on some hadoop versions
  • [HIVE-4554] - Failed to create a table from existing file if file path has spaces
  • [HIVE-4559] - hcatalog/webhcat scripts in tar.gz don't have execute permissions set
  • [HIVE-4562] - HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
  • [HIVE-4566] - NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established
  • [HIVE-4572] - ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap
  • [HIVE-4573] - Support alternate table types for HiveServer2
  • [HIVE-4578] - Changes to Pig's test harness broke HCat e2e tests
  • [HIVE-4580] - Change DDLTask to report errors using canonical error messages rather than http status codes
  • [HIVE-4581] - HCat e2e tests broken by changes to Hive's describe table formatting
  • [HIVE-4585] - Remove unused MR Temp file localization from Tasks
  • [HIVE-4586] - [HCatalog] WebHCat should return 404 error for undefined resource
  • [HIVE-4589] - Hive Load command failed when inpath contains space or any restricted characters
  • [HIVE-4591] - Making changes to webhcat-site.xml have no effect
  • [HIVE-4593] - ErrorMsg has several messages that reuse the same error code
  • [HIVE-4611] - SMB joins fail based on bigtable selection policy.
  • [HIVE-4615] - Invalid column names allowed when created dynamically by a SerDe
  • [HIVE-4618] - show create table creating unusable DDL when field delimiter is \001
  • [HIVE-4619] - Hive 0.11.0 is not working with pre-cdh3u6 and hadoop-0.23
  • [HIVE-4638] - Thread local PerfLog can get shared by multiple hiveserver2 sessions
  • [HIVE-4650] - Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x
  • [HIVE-4657] - HCatalog checkstyle violation after HIVE-2670
  • [HIVE-4677] - [HCatalog] WebHCat e2e tests fail on Hadoop 2
  • [HIVE-4679] - WebHCat can deadlock Hadoop if the number of concurrently running tasks is higher than or equal to the number of mappers
  • [HIVE-4683] - fix coverage org.apache.hadoop.hive.cli
  • [HIVE-4689] - For outerjoins, joinEmitInterval might make wrong result
  • [HIVE-4691] - orc_createas1.q has minor inconsistency
  • [HIVE-4692] - Constant agg parameters will be replaced by ExprNodeColumnDesc with single-sourced multi-gby cases
  • [HIVE-4696] - WebHCat e2e test framework is missing files and instructions
  • [HIVE-4707] - Support configurable domain name for HiveServer2 LDAP authentication using Active Directory
  • [HIVE-4710] - ant maven-build -Dmvn.publish.repo=local fails
  • [HIVE-4724] - ORC readers should have a better error detection for non-ORC files
  • [HIVE-4730] - Join on more than 2^31 records on single reducer failed (wrong results)
  • [HIVE-4733] - HiveLockObjectData is not compared properly
  • [HIVE-4740] - HIVE-2379 is missing hbase.jar itself
  • [HIVE-4742] - A useless CAST makes Hive fail to create a VIEW based on an UNION
  • [HIVE-4748] - Fix TempletonUtilsTest failure on Windows
  • [HIVE-4757] - LazyTimestamp goes into irretrievable NULL mode once inited with NULL once
  • [HIVE-4781] - LEFT SEMI JOIN generates wrong results when the number of rows belonging to a single key of the right table exceed hive.join.emit.interval
  • [HIVE-4784] - ant testreport doesn't include any HCatalog tests
  • [HIVE-4785] - Implement isCaseSensitive for Hive JDBC driver
  • [HIVE-4789] - FetchOperator fails on partitioned Avro data
  • [HIVE-4798] - NPE when we call isSame from an instance of ExprNodeConstantDesc with null value
  • [HIVE-4802] - Fix url check for missing "/" or "/ after hostname in jdbc uri
  • [HIVE-4804] - parallel order by fails for small datasets
  • [HIVE-4807] - Hive metastore hangs
  • [HIVE-4808] - WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
  • [HIVE-4810] - Refactor exec package
  • [HIVE-4811] - (Slightly) break up the SemanticAnalyzer monstrosity
  • [HIVE-4812] - Logical explain plan
  • [HIVE-4814] - Adjust WebHCat e2e tests until HIVE-4703 is addressed
  • [HIVE-4818] - SequenceId in operator is not thread safe
  • [HIVE-4820] - webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure
  • [HIVE-4829] - TestWebHCatE2e checkstyle violation causes all tests to fail
  • [HIVE-4830] - Test clientnegative/nested_complex_neg.q got broken due to 4580
  • [HIVE-4833] - Fix eclipse template classpath to include the correct jdo lib
  • [HIVE-4836] - make checkstyle ignore IntelliJ files and templeton e2e files
  • [HIVE-4838] - Refactor MapJoin HashMap code to improve testability and readability
  • [HIVE-4839] - build-common.xml has
  • [HIVE-4840] - Fix eclipse template classpath to include the BoneCP lib
  • [HIVE-4843] - Refactoring MapRedTask and ExecDriver for better re-usability (for tez) and readability
  • [HIVE-4845] - Correctness issue with MapJoins using the null safe operator
  • [HIVE-4852] - -Dbuild.profile=core fails
  • [HIVE-4853] - junit timeout needs to be updated
  • [HIVE-4854] - testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
  • [HIVE-4863] - Fix parallel order by on hadoop2
  • [HIVE-4865] - HiveLockObjects: Unlocking retries/times out when query contains ":"
  • [HIVE-4869] - Clean up HCatalog build post Hive integration
  • [HIVE-4870] - Explain Extended to show partition info for Fetch Task
  • [HIVE-4875] - hive config template is not parse-able due to angle brackets in description
  • [HIVE-4876] - Beeline help text does not contain -f and -e parameters
  • [HIVE-4878] - With Dynamic partitioning, some queries would scan default partition even if query is not using it.
  • [HIVE-4883] - TestHadoop20SAuthBridge tests fail sometimes because of race condition
  • [HIVE-4891] - Distinct includes duplicate records
  • [HIVE-4892] - PTest2 cleanup after merge
  • [HIVE-4893] - [WebHCat] HTTP 500 errors should be mapped to 400 for bad request
  • [HIVE-4899] - Hive returns non-meaningful error message for ill-formed fs.default.name
  • [HIVE-4900] - Fix the mismatched column names in package.jdo
  • [HIVE-4915] - unit tests fail on windows because of difference in input file size
  • [HIVE-4927] - When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
  • [HIVE-4928] - Date literals do not work properly in partition spec clause
  • [HIVE-4929] - the type of all numeric constants is changed to double in the plan
  • [HIVE-4930] - Classes of metastore should not be included in MR-task
  • [HIVE-4932] - PTFOperator fails resetting PTFPersistence
  • [HIVE-4935] - Potential NPE in MetadataOnlyOptimizer
  • [HIVE-4942] - Fix eclipse template files to use correct datanucleus libs
  • [HIVE-4951] - combine2_win.q.out needs update for HIVE-3253 (increasing nesting levels)
  • [HIVE-4952] - When hive.join.emit.interval is small, queries optimized by Correlation Optimizer may generate wrong results
  • [HIVE-4955] - serde_user_properties.q.out needs to be updated
  • [HIVE-4962] - fix eclipse template broken by HIVE-3256
  • [HIVE-4964] - Cleanup PTF code: remove code dealing with non standard sql behavior we had originally introduced
  • [HIVE-4968] - When deduplicating multiple SelectOperators, we should update RowResolver accordingly
  • [HIVE-4970] - BinaryConverter does not respect nulls
  • [HIVE-4972] - update code generated by thrift for DemuxOperator and MuxOperator
  • [HIVE-4987] - Javadoc can generate argument list too long error
  • [HIVE-4990] - ORC seeks fails with non-zero offset or column projection
  • [HIVE-4991] - hive build with 0.20 is broken
  • [HIVE-4995] - select * may incorrectly return empty fields with hbase-handler
  • [HIVE-4998] - support jdbc documented table types in default configuration
  • [HIVE-5010] - HCatalog maven integration doesn't override mvn.local.repo in two locations
  • [HIVE-5011] - Dynamic partitioning in HCatalog broken on external tables
  • [HIVE-5012] - [HCatalog] Make HCatalog work on Windows
  • [HIVE-5017] - DBTokenStore gives compiler warnings
  • [HIVE-5023] - Hive gets wrong result when partition has the same path but different schema or authority
  • [HIVE-5026] - HIVE-3926 is committed in the state of not rebased to trunk
  • [HIVE-5034] - [WebHCat] Make WebHCat work for Windows
  • [HIVE-5046] - Hcatalog's bin/hcat script doesn't respect HIVE_HOME
  • [HIVE-5047] - Hive client filters partitions incorrectly via pushdown in certain cases involving "or"
  • [HIVE-5048] - StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.
  • [HIVE-5049] - Create an ORC test case that has a 0.11 ORC file
  • [HIVE-5051] - StorageBasedAuthorizationProvider masks lower level exception with IllegalStateException
  • [HIVE-5055] - SessionState temp file gets created in history file directory
  • [HIVE-5056] - MapJoinProcessor ignores order of values in removing RS
  • [HIVE-5060] - JDBC driver assumes executeStatement is synchronous
  • [HIVE-5061] - Row sampling throws NPE when used in sub-query
  • [HIVE-5075] - bug in ExprProcFactory.genPruner
  • [HIVE-5079] - Make Hive compile under Windows
  • [HIVE-5084] - Fix newline.q on Windows
  • [HIVE-5085] - Hive Metatool errors out if HIVE_OPTS is set
  • [HIVE-5087] - Rename npath UDF to matchpath
  • [HIVE-5089] - Non query PreparedStatements are always failing on remote HiveServer2
  • [HIVE-5091] - ORC files should have an option to pad stripes to the HDFS block boundaries
  • [HIVE-5100] - RCFile::sync(long) missing 1 byte in System.arraycopy()
  • [HIVE-5104] - HCatStorer fails to store boolean type
  • [HIVE-5105] - HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap
  • [HIVE-5106] - HCatFieldSchema overrides equals() but not hashCode()
  • [HIVE-5120] - document what hive.server2.thrift.sasl.qop values mean in hive-default.xml.template
  • [HIVE-5122] - Add partition for multiple partition ignores locations for non-first partitions
  • [HIVE-5123] - group by on a same key producing wrong result
  • [HIVE-5127] - Upgrade xerces and xalan for WebHCat
  • [HIVE-5128] - Direct SQL for view is failing
  • [HIVE-5129] - Multiple table insert fails on count(distinct)
  • [HIVE-5131] - JDBC client's hive variables are not passed to HS2
  • [HIVE-5137] - A Hive SQL query should not return a ResultSet when the underlying plan does not include a FetchTask
  • [HIVE-5144] - HashTableSink allocates empty new Object[] arrays & OOMs - use a static emptyRow instead
  • [HIVE-5145] - Fix TestCliDriver.list_bucket_query_multiskew_2.q on hadoop 0.23
  • [HIVE-5149] - ReduceSinkDeDuplication can pick the wrong partitioning columns
  • [HIVE-5156] - HiveServer2 jdbc ResultSet.close should free up resources on server side
  • [HIVE-5161] - Additional SerDe support for varchar type
  • [HIVE-5167] - webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
  • [HIVE-5196] - ThriftCLIService.java uses stderr to print the stack trace, it should use the logger instead.
  • [HIVE-5198] - WebHCat returns exitcode 143 (w/o an explanation)
  • [HIVE-5199] - Custom SerDe containing a nonSettable complex data type row object inspector throws cast exception with HIVE 0.11
  • [HIVE-5203] - FunctionRegistry.getMethodInternal() should prefer method arguments with closer affinity to the original argument types
  • [HIVE-5210] - WebHCatJTShim implementations are missing Apache license headers
  • [HIVE-5239] - LazyDate goes into irretrievable NULL mode once inited with NULL once
  • [HIVE-5241] - Default log4j log level for WebHCat should be INFO not DEBUG
  • [HIVE-5246] - Local task for map join submitted via oozie job fails on a secure HDFS
  • [HIVE-5255] - Missing metastore schema files for version 0.11
  • [HIVE-5265] - Direct SQL fallback broken on Postgres
  • [HIVE-5274] - HCatalog package renaming backward compatibility follow-up
  • [HIVE-5285] - Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
  • [HIVE-5292] - Join on decimal columns fails to return rows
  • [HIVE-5296] - Memory leak: OOM Error after multiple open/closed JDBC connections.
  • [HIVE-5297] - Hive does not honor type for partition columns
  • [HIVE-5301] - Add a schema tool for offline metastore schema upgrade
  • [HIVE-5322] - FsPermission is initialized incorrectly in HIVE 5513
  • [HIVE-5329] - Date and timestamp type converts invalid strings to '1970-01-01'
  • [HIVE-5337] - org.apache.hcatalog.common.HCatUtil is used by org.apache.hive.hcatalog.templeton.tool
  • [HIVE-5352] - cast('1.0' as int) returns null
  • [HIVE-5357] - ReduceSinkDeDuplication optimizer picks the wrong keys in pRS-cGBYm-cRS-cGBYr scenario when there are distinct keys in child GBY
  • [HIVE-5362] - TestHCatHBaseInputFormat has a bug which will not allow it to run on JDK7 and RHEL 6
  • [HIVE-5364] - NPE on some queries from partitioned orc table
  • [HIVE-5374] - hive-schema-0.13.0.postgres.sql doesn't work
  • [HIVE-5375] - Bug in Hive-0.12 branch with parameterized types due to merge conflict with HIVE-5199
  • [HIVE-5394] - ObjectInspectorConverters.getConvertedOI() does not return the correct object inspector for primitive type.
  • [HIVE-5401] - Array Out Of Bounds in OrcRecordReader
  • [HIVE-5402] - StorageBasedAuthorizationProvider is not correctly able to determine that it is running from client-side
  • [HIVE-5405] - Need to implement PersistenceDelegate for org.antlr.runtime.CommonToken
  • [HIVE-5410] - Hive command line option --auxpath still does not work post HIVE-5363
  • [HIVE-5413] - StorageDelegationAuthorizationProvider uses non-existent org.apache.hive.hcatalog.hbase.HBaseHCatStorageHandler
  • [HIVE-5416] - templeton/tests/jobsubmission2.conf erroneously removed
  • [HIVE-5419] - Fix schema tool issues with Oracle metastore
  • [HIVE-5426] - TestThriftBinaryCLIService tests fail on branch 0.12
  • [HIVE-5429] - HiveVarcharWritable length not reset when value is changed
  • [HIVE-5431] - PassthroughOutputFormat SH changes cause IllegalArgumentException
  • [HIVE-5433] - Fix varchar unit tests to work with hadoop-2.1.1
  • [HIVE-5476] - Authorization-provider tests fail in sequential run
  • [HIVE-5477] - maven-publish fails because it can't find hive-metastore-0.12.0.pom
  • [HIVE-5488] - some files are missing apache license headers
  • [HIVE-5489] - NOTICE copyright dates are out of date, README needs update
  • [HIVE-5493] - duplicate jars with different versions for guava, commons-logging
  • [HIVE-5497] - Hive trunk broken against hadoop 0.20.2
  • [HIVE-5769] - when "hive.server2.authentication" is set to "NONE", does "hive.server2.enable.doAs" always work?
  • [HIVE-5864] - Hive Table filter Not working (ERROR:SemanticException MetaException)
  • Improvement:
  • [HIVE-2084] - Upgrade datanucleus from 2.0.3 to a more recent version (3.?)
  • [HIVE-2608] - Do not require AS a,b,c part in LATERAL VIEW
  • [HIVE-2906] - Support providing some table properties by user via SQL
  • [HIVE-3603] - Enable client-side caching for scans on HBase
  • [HIVE-3725] - Add support for pulling HBase columns with prefixes
  • [HIVE-3764] - Support metastore version consistency check
  • [HIVE-3807] - Hive authorization should use short username when Kerberos authentication
  • [HIVE-4002] - Fetch task aggregation for simple group by query
  • [HIVE-4068] - Size of aggregation buffer which uses non-primitive type is not estimated correctly
  • [HIVE-4172] - JDBC2 does not support VOID type
  • [HIVE-4209] - Cache evaluation result of deterministic expression and reuse it
  • [HIVE-4228] - Bump up hadoop2 version in trunk
  • [HIVE-4241] - optimize hive.enforce.sorting and hive.enforce.bucketing join
  • [HIVE-4268] - Beeline should support the -f option
  • [HIVE-4294] - Single sourced multi query cannot handle lateral view
  • [HIVE-4310] - optimize count(distinct) with hive.map.groupby.sorted
  • [HIVE-4393] - Make the deleteData flag accessible from DropTable/Partition events
  • [HIVE-4409] - Prevent incompatible column type changes
  • [HIVE-4423] - Improve RCFile::sync(long) 10x
  • [HIVE-4443] - [HCatalog] Have an option for GET queue to return all job information in single call
  • [HIVE-4444] - [HCatalog] WebHCat Hive should support equivalent parameters as Pig
  • [HIVE-4459] - Script hcat is overriding HIVE_CONF_DIR variable
  • [HIVE-4530] - Enforce minimum ant version required in build script
  • [HIVE-4549] - JDBC compliance change TABLE_SCHEMA to TABLE_SCHEM
  • [HIVE-4579] - Create a SARG interface for RecordReaders
  • [HIVE-4588] - Support session level hooks for HiveServer2
  • [HIVE-4601] - WebHCat needs to support proxy users
  • [HIVE-4609] - Allow hive tests to specify an alternative to /tmp
  • [HIVE-4610] - HCatalog checkstyle violation after HIVE-4578
  • [HIVE-4617] - Asynchronous execution in HiveServer2 to run a query in non-blocking mode
  • [HIVE-4620] - MR temp directory conflicts in case of parallel execution mode
  • [HIVE-4647] - RetryingHMSHandler logs too many error messages
  • [HIVE-4658] - Make KW_OUTER optional in outer joins
  • [HIVE-4675] - Create new parallel unit test environment
  • [HIVE-4682] - Temporary files are not closed in PTFPersistence on jvm reuse.
  • [HIVE-4737] - Allow access to MapredContext
  • [HIVE-4772] - Enable parallel execution of various E2E tests
  • [HIVE-4825] - Separate MapredWork into MapWork and ReduceWork
  • [HIVE-4827] - Merge a Map-only task to its child task
  • [HIVE-4858] - Sort "show grant" result to improve usability and testability
  • [HIVE-4873] - Sort candidate functions in case of UDFArgumentException
  • [HIVE-4874] - Identical methods PTFDeserializer.addOIPropertiestoSerDePropsMap(), PTFTranslator.addOIPropertiestoSerDePropsMap()
  • [HIVE-4877] - In ExecReducer, remove tag from the row which will be passed to the first Operator at the Reduce-side
  • [HIVE-4879] - Window functions that imply order can only be registered at compile time
  • [HIVE-4885] - Alternative object serialization for execution plan in hive testing
  • [HIVE-4913] - Put deterministic ordering in the top-K ngrams output of UDF context_ngrams()
  • [HIVE-4920] - PTest2 handle Spot Price increases gracefully and improve rsync parallelism
  • [HIVE-4948] - WriteLockTest and ZNodeNameTest do not follow test naming pattern
  • [HIVE-4954] - PTFTranslator hardcodes ranking functions
  • [HIVE-4960] - lastAlias in CommonJoinOperator is not used
  • [HIVE-4967] - Don't serialize unnecessary fields in query plan
  • [HIVE-4985] - refactor/clean up partition name pruning to be usable inside metastore server
  • [HIVE-4992] - add ability to skip javadoc during build
  • [HIVE-5006] - Re-factor HiveServer2 JDBC PreparedStatement to avoid duplicate code
  • [HIVE-5027] - Upgrade Ivy to 2.3
  • [HIVE-5031] - [WebHCat] GET job/:jobid to return userargs for a job in addition to status information
  • [HIVE-5062] - Insert + orderby + limit does not need additional RS for limiting rows
  • [HIVE-5111] - ExprNodeColumnDesc doesn't distinguish partition and virtual columns, causing partition pruner to receive the latter
  • [HIVE-5121] - Remove obsolete code on SemanticAnalyzer#genJoinTree
  • [HIVE-5158] - allow getting all partitions for table to also use direct SQL path
  • [HIVE-5182] - log more stuff via PerfLogger
  • [HIVE-5206] - Support parameterized primitive types
  • [HIVE-5209] - JDBC support for varchar
  • [HIVE-5267] - Use array instead of Collections if possible in DemuxOperator
  • [HIVE-5278] - Move some string UDFs to GenericUDFs, for better varchar support
  • [HIVE-5363] - HIVE-3978 broke the command line option --auxpath
  • New Feature:
  • [HIVE-305] - Port Hadoop streaming's counters/status reporters to Hive Transforms
  • [HIVE-1402] - Add parallel ORDER BY to Hive
  • [HIVE-2206] - add a new optimizer for query correlation discovery and optimization
  • [HIVE-2482] - Convenience UDFs for binary data type
  • [HIVE-2517] - Support group by on struct type
  • [HIVE-2655] - Ability to define functions in HQL
  • [HIVE-2670] - A cluster test utility for Hive
  • [HIVE-3255] - Add DBTokenStore to store Delegation Tokens in DB
  • [HIVE-4005] - Column truncation
  • [HIVE-4095] - Add exchange partition in Hive
  • [HIVE-4123] - The RLE encoding for ORC can be improved
  • [HIVE-4246] - Implement predicate pushdown for ORC
  • [HIVE-4531] - [WebHCat] Collecting task logs to hdfs
  • [HIVE-4614] - Support outer lateral view
  • [HIVE-4844] - Add varchar data type
  • [HIVE-4911] - Enable QOP configuration for Hive Server 2 thrift transport
  • [HIVE-4963] - Support in memory PTF partitions
  • Task:
  • [HIVE-4331] - Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
  • [HIVE-4819] - Comments in CommonJoinOperator for aliasTag are not valid
  • [HIVE-4886] - beeline code should have apache license headers
  • [HIVE-4999] - Shim class HiveHarFileSystem does not have a hadoop2 counterpart
  • [HIVE-5059] - Meaningless warning message from TypeCheckProcFactory
  • [HIVE-5116] - HIVE-2608 didn't remove udtf_not_supported2.q test
  • [HIVE-5219] - Move VerifyingObjectStore into ql package
  • [HIVE-5313] - HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string)
  • Test:
  • [HIVE-4526] - auto_sortmerge_join_9.q throws NPE but test succeeds
  • [HIVE-4636] - Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
  • [HIVE-4645] - Stat information like numFiles and totalSize is not correct when a sub-directory exists
  • [HIVE-4743] - Improve test coverage of package org.apache.hadoop.hive.ql.io
  • [HIVE-4779] - Enhance coverage of package org.apache.hadoop.hive.ql.udf
  • [HIVE-4791] - improve test coverage of package org.apache.hadoop.hive.ql.udf.xml
  • [HIVE-4796] - Increase coverage of package org.apache.hadoop.hive.common.metrics
  • [HIVE-4805] - Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors
  • [HIVE-4813] - Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr
  • [HIVE-5029] - direct SQL perf optimization cannot be tested well
  • [HIVE-5033] - Test result of ppd_vc.q is not updated
  • [HIVE-5096] - Add q file tests for ORC predicate pushdown
  • [HIVE-5117] - orc_dictionary_threshold is not deterministic
  • [HIVE-5147] - Newly added test TestSessionHooks is failing on trunk
  • [HIVE-5197] - TestE2EScenerios.createTaskAttempt should use MapRedUtil

New in Apache Hive 0.11.0 (May 21, 2013)

  • Sub-task:
  • optimize orderby followed by a groupby
  • TypeInfoFactory is not thread safe and is accessed by multiple threads
  • InspectorFactories contains static HashMaps which can cause infinite loop
  • disable TestBeeLineDriver
  • disable TestBeeLineDriver in ptest util
  • Integrate HCatalog site into Hive site
  • Adjust build.xml package command to move all hcat jars and binaries into build
  • Move HCatalog trunk code from trunk/hcatalog/historical to trunk/hcatalog
  • HCatalog branches need to move out of trunk/hcatalog/historical
  • HCat needs to get current Hive jars instead of pulling them from maven repo
  • Merge HCat NOTICE file with Hive NOTICE file
  • Clean up remaining items in hive/hcatalog/historical/trunk
  • Bug:
  • Hive server is SHUTTING DOWN when invalid queries are being executed.
  • If all of the parameters of distinct functions exist in group by columns, query fails at runtime
  • ObjectInspectorConverters cannot convert Void types to Array/Map/Struct types.
  • should throw "Ambiguous column reference key" Exception in particular join condition
  • Aggregations without grouping should return NULL when applied to partitioning column of a partitionless table
  • Invalid tag is used for MapJoinProcessor
  • Filters on outer join with mapjoin hint are not applied correctly
  • Hive CI failing due to script_broken_pipe1.q
  • Comment indenting is broken for "describe" in CLI
  • HBase Handler doesn't handle NULLs properly
  • Hive compile errors under Java 7 (JDBC 4.1)
  • change hive.auto.convert.join's default value to true
  • LOAD DATA INPATH fails if a hdfs file with same name is added to table
  • Mixing avro and snappy gives null values
  • semi-colon in comments in .q file does not work
  • Result of outer join is not valid
  • HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification
  • user should not specify mapjoin to perform sort-merge bucketed join
  • Fix log4j configuration errors when running hive on hadoop23
  • PrimitiveObjectInspector doesn't handle timestamps properly
  • Merging join tree may reorder joins which could be invalid
  • Implement * or a.* for arguments to UDFs
  • Avro SerDe doesn't handle serializing Nullable types that require access to a Schema
  • release locks at the end of move tasks
  • NPE in union processing followed by lateral view followed by 2 group bys
  • When the group by partition column type is Timestamp or STRING with a format containing "HH:MM:SS", a URISyntaxException occurs
  • reflect udf cannot find method which has arguments of primitive types and String, Binary, Timestamp types mixed
  • script_pipe.q fails when using JDK7
  • RCFileWriter does not implement the right function to support Federation
  • HiveMetaStoreFsImpl is not compatible with hadoop viewfs
  • Allow URIs without port to be specified in metatool
  • External JAR files on HDFS can lead to race condition with hive.downloaded.resources.dir
  • enhanceModel.notRequired is incorrectly determined
  • Multiple insert overwrite into multiple tables query stores same results in all tables
  • Renaming table changes table location scheme/authority
  • Hive Query Explain Plan JSON not being created properly
  • Patch: Hive's ivy internal resolvers need to use sourceforge for sqlline
  • Hive won't compile with -Dhadoop.mr.rev=20S
  • make optimizing multi-group by configurable
  • Error in groupSetExpression rule in Hive grammar
  • PTest doesn't work due to hive snapshot version upgrade to 11
  • Driver.validateConfVariables() should perform more validations
  • Provide hive operation name for hookContext
  • JDBCStatsPublisher fails when ID length exceeds length of ID column
  • union_remove_9.q fails in trunk (hadoop 23)
  • TestNegativeMinimrCliDriver_mapreduce_stack_trace.q fails on hadoop-1
  • Enable adding hooks to hive meta store init
  • BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also
  • HIVE-3750 broke TestParse
  • Sort merge join should work if join cols are a prefix of sort columns for each partition
  • Unit test failures due to unspecified order of results in "show grant" command
  • Add MapJoinDesc.isBucketMapJoin() as part of explain plan
  • testCliDriver_sample_islocalmode_hook fails on hadoop-1
  • stats19.q is failing on trunk
  • Regression introduced from HIVE-3401
  • testCliDriver_repair fails on hadoop-1
  • Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
  • NPE in SELECT when WHERE-clause is an and/or/not operation involving null
  • testCliDriver_combine2 fails on hadoop-1
  • testCliDriver_loadpart_err fails on hadoop-1
  • testCliDriver_input39 fails on hadoop-1
  • explain dependency should show the dependencies hierarchically in presence of views
  • Ptest failing due to "Argument list too long" errors
  • Concurrency issue in RCFile: multiple threads can use the same decompressor
  • Adding the name space for the maven task for the maven-publish target.
  • Consider creating a literal like "D" or "BD" for representing Decimal type constants
  • bug if different serdes are used for different partitions
  • Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
  • insert overwrite fails with stored-as-dir in cluster
  • Hive CLI needs UNSET TBLPROPERTY command
  • Insert overwrite doesn't create a dir if the skewed column position doesn't match
  • adding .gitattributes file for normalizing line endings during cross platform development
  • hive cli null representation in output is inconsistent
  • ppd.remove.duplicatefilters removing filters too aggressively
  • Aliased column in where clause for multi-groupby single reducer cannot be resolved
  • hour() function returns 12 hour clock value when using timestamp datatype
  • Multi-groupby optimization fails when same distinct column is used twice or more
  • Normalize left over CRLF files
  • Upgrade hbase dependency to 0.94
  • testHBaseNegativeCliDriver_cascade_dbdrop fails on hadoop-1
  • MAP JOIN for VIEW throws NULL pointer exception error
  • lot of tests failing for hadoop 23
  • negative value for hive.stats.ndv.error should be disallowed
  • wrong mapside groupby if no partition is being selected
  • something wrong with the hive-default.xml
  • Partition pruning fails on = expression
  • create view statement's outputs contains the view and a temporary dir.
  • Wrong data due to HIVE-2820
  • table_access_keys_stats.q fails with hadoop 0.23
  • Possible deadlock in ZK lock manager
  • Union with map-only query on one side and two MR job query on the other produces wrong results
  • For outer joins, when looping over the rows looking for filtered tags, it doesn't report progress
  • Normalize more CRLF line endings
  • Change test for HIVE-2332
  • recursive_dir.q fails on 0.23
  • join_filters_overlap.q fails on 0.23
  • join_nullsafe.q fails on 0.23
  • Potential overflow with new RCFileCat column sizes options
  • Add Oracle metastore upgrade script for 0.9 to 0.10.0
  • Hive release tarballs don't contain PostgreSQL metastore scripts
  • Skewed query fails if hdfs path has special characters
  • MiniMR test remains pending after test completion
  • avro_nullable_fields.q is failing in trunk
  • Hive 0.10 postgres schema script is broken
  • Cleanup after HIVE-3403
  • Maintain a clear separation between Windowing & PTF at the specification level.
  • Update new UDAFs introduced for Windowing to work with new Decimal Type
  • Fix select expr processing in PTF Operator
  • Update PTF invocation and windowing grammar
  • Hive RCFile::sync(long) does a sub-sequence linear search for sync blocks
  • PostgreSQL upgrade scripts are not valid
  • Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0
  • Mysql metastore upgrade script will end up with different schema than the full schema load
  • Hive client goes into infinite loop at 100% cpu
  • Incorrect status for AddPartition metastore event if RawStore commit fails
  • MapJoin failing with Distributed Cache error
  • PostgreSQL upgrade scripts are creating column with incorrect name
  • Derby metastore update script will fail when upgrading from 0.9.0 to 0.10.0
  • Thrift alter_table api doesn't validate column type
  • Bring parenthesis handling in windowing specification in compliance with sql standard
  • Hive Profiler dies with NPE
  • Name windowing function consistently with sql standard
  • NPE at runtime while selecting virtual column after joining three tables on different keys
  • Should be able to specify windowing spec without needing Between
  • Column Pruner for PTF Op
  • remove use of FunctionRegistry during PTF Op initialization
  • Hive compiler sometimes fails in semantic analysis / optimisation stage when boolean variable appears in WHERE clause.
  • fix ptf negative tests
  • Support multiple partitionings in a single Query
  • Disallow partition/sort and distribute/order combinations in windowing and partitioning spec
  • Extend rcfilecat to support (un)compressed size and no. of rows
  • Followup to HIVE-701: reduce ambiguity in grammar
  • Map-join outer join produces incorrect results.
  • Hive eclipse build path update for string template jar
  • Make partition by optional in over clause
  • alterPartition and alterPartitions methods in ObjectStore swallow exceptions
  • Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
  • Altering a view partition fails with NPE
  • Add Lead & Lag UDAFs
  • allow expressions with over clause
  • Break up ptf tests in PTF, Windowing and Lead/Lag tests
  • PTF ColumnPruner doesn't account for Partition & Order expressions
  • Generated aliases for windowing expressions are broken
  • Use of hive.exec.script.allow.partial.consumption can produce partial results
  • Store complete names of tables in column access analyzer
  • Remove sprintf from PTFTranslator and use String.format()
  • decimal_3.q & decimal_serde.q fail on hadoop 2
  • problem in hive.map.groupby.sorted with distincts
  • ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids
  • OrcInputFormat assumes Hive always calls createValue
  • Remove System.gc() call from the map-join local-task loop
  • Hive localtask does not buffer disk-writes or reads
  • Hive MapJoinOperator unnecessarily deserializes values for all join-keys
  • Update Hive 0.10.0 RELEASE_NOTES.txt
  • Allow over() clause to contain an order by with no partition by
  • Partition by column does not have to be in order by
  • Default value in lag is not handled correctly
  • Window range specification should be more flexible
  • ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty
  • Queries fail if timestamp data not in expected format
  • remove support for lead/lag UDFs outside of UDAF args
  • Bring the Lead/Lag UDFs interface in line with Lead/Lag UDAFs
  • Fix eclipse template classpath to include new packages added by ORC file patch
  • ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
  • MiniDFS shim does not work for hadoop 2
  • Specifying alias for windowing function
  • Remove inferring partition specification behavior
  • Incorrect column mappings with over clause
  • bug with hive.auto.convert.join.noconditionaltask with outer joins
  • Cleanup aisle "ivy"
  • wrong results in big outer joins with array of ints
  • HiveProfiler NPE with ScriptOperator
  • NPE reading column of empty string from ORC file
  • need to add protobuf classes to hive-exec.jar
  • RetryingHMSHandler doesn't retry in enough cases
  • Hive converts bucket map join to SMB join even when tables are not sorted
  • union_remove_*.q fail on hadoop 2
  • [REGRESSION] FsShell.close closes filesystem, removing temporary directories
  • Round UDF converts BigInts to double
  • ORC fails with files with different numbers of columns
  • NonBlockingOpDeDup does not merge SEL operators correctly
  • Filter getting dropped with PTFOperator
  • doAS does not work with HiveServer2 in non-kerberos mode with local job
  • Document HiveServer2 setup under the admin documentation on hive wiki
  • Document HiveServer2 JDBC and Beeline CLI in the user documentation
  • NPE in ReduceSinkDeDuplication
  • QL build-grammar target fails after HIVE-4148
  • TestJdbcDriver2.testDescribeTable failing consistently
  • ORC fails with String column that ends in lots of nulls
  • OVER clauses with ORDER BY not getting windowing set properly
  • describe table output always prints as if formatted keyword is specified
  • Bring windowing support in line with SQL Standard
  • reuse Partition objects in PTFOperator processing
  • Clientpositive test parenthesis_star_by is non-deterministic
  • Fix show_create_table_*.q test failures
  • explain dependency does not capture the input table
  • CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
  • hiveserver2 string representation of complex types is inconsistent with cli
  • Code cleanup : debug methods, having clause associated with Windowing
  • update show_functions.q.out for functions added for windowing
  • SEL operator created with missing columnExprMap for unions
  • union_remove_12, union_remove_13 are failing on hadoop2
  • union_remove_10 is failing on hadoop2 with assertion (root task with non-empty set of parents)
  • fix last_value UDAF behavior
  • fix handling of binary type in hiveserver2, jdbc driver
  • bug in hive.map.groupby.sorted in the presence of multiple input partitions
  • Limit precision of decimal type
  • partition wise metadata does not work for text files
  • Hive does not differentiate scheme and authority in file uris
  • TestRetryingHMSHandler is failing on trunk.
  • Add IntelliJ project files to .gitignore
  • HCatalog build fails when behind a firewall
  • hiveserver2 should support -hiveconf commandline parameter
  • ant thriftif fails on hcatalog
  • Fix how RowSchema and RowResolver are set on ReduceSinkOp that precedes PTFOp
  • empty java files in hcatalog
  • Newly added test TestCliDriver.hiveprofiler_union0 is failing on trunk
  • DOS line endings in auto_join26.q
  • enable doAs in unsecure mode for hive server2, when MR job runs locally
  • OperatorHooks hit performance even when not used
  • Revert changes checked-in as part of HIVE-1953
  • Consider extending max limit for precision to 38
  • sqlline dependency is not required
  • NPE in constant folding with decimal
  • orc*.q tests fail on hadoop 2
  • most windowing tests fail on hadoop 2
  • ctas test on hadoop 2 has outdated golden file
  • serde_regex test fails on hadoop 2
  • Selecting from a view, and another view that also selects from that view fails
  • NPE for query involving UNION ALL with nested JOIN and UNION ALL
  • Guava not getting included in build package
  • remove duplicate impersonation parameters for hiveserver2
  • Check for Map side processing in PTFOp is no longer valid
  • wrong result in left semi join
  • some issue with merging join trees
  • Hive Version returned by HiveDatabaseMetaData.getDatabaseProductVersion is incorrect
  • Counters hit performance even when not used
  • ant maven-build fails because hcatalog doesn't have a make-pom target
  • test leadlag.q fails
  • HS2 Resource leak: operation handles not cleaned when originating session is closed
  • TestHCatStorer.testStoreFuncAllSimpleTypes fails because of null case difference
  • PTFDesc tries to serialize transient fields like OIs, etc.
  • webhcat - support ${WEBHCAT_PREFIX}/conf/ as config directory
  • HCatalog unit tests stop after a failure
  • Improve memory usage by ORC dictionaries
  • hcatalog version numbers need to be updated
  • HCatalog build directories get included in tar file produced by "ant tar"
  • hcatalog jars not getting published to maven repo
  • ORC map columns get class cast exception in some context
  • TestBeeLineWithArgs.testPositiveScriptFile fails
  • HS2 holding too many file handles of hive_job_log_hive_*.txt files
  • Hive can't load transforms added using 'ADD FILE'
  • Fix eclipse project template
  • Improvement:
  • improve group by syntax
  • more query plan optimization rules
  • Hive should process comments in CliDriver
  • Upgrade antlr version to 3.4
  • Use name of original expression for name of CAST output
  • RegexSerDe should support other column types in addition to STRING
  • msck repair should find partitions already containing data files
  • Add environment context to metastore Thrift calls
  • Diversify grammar for split sampling
  • Avoid race conditions while downloading resources from non-local filesystem
  • Provide ALTER for partition changing bucket number
  • Allow CREATE TABLE LIKE command to take TBLPROPERTIES
  • Simple lock manager for dedicated hive server
  • hivetest.py: revision number and applied patch
  • Provide a way to use counters in Hive through UDF
  • sort-merge join does not work with sub-queries
  • Support altering partition column type in Hive
  • Add mapreduce workflow information to job configuration
  • Stop storing default ConfVars in temp file
  • HiveConf.ConfVars.HIVE_STATS_COLLECT_RAWDATASIZE should not be checked in FileSinkOperator
  • Minor fix for 'tableName' in Hive.g
  • de-emphasize mapjoin hint
  • Print number of fetched rows after query in CliDriver
  • Multi-insert involving bucketed/sorted table turns off merging on all outputs
  • Better error message if metalisteners or hookContext cannot be loaded/instantiated
  • Resolve TODO in TUGIBasedProcessor
  • object inspectors should be initialized based on partition metadata
  • UDF unix_timestamp is deterministic if an argument is given, but it is treated as non-deterministic, preventing PPD
  • Create a new Optimized Row Columnar file format for Hive
  • Better align columns in DESCRIBE table_name output to make more human-readable
  • Replace hashmaps in JoinOperators with arrays
  • Support noscan operation for analyze command
  • Remove code for merging files via MR job
  • merge map-job followed by map-reduce job
  • support partial scan for analyze command - RCFile
  • Clean up/fix PartitionNameWhitelistPreEventListener
  • Correctly enforce the memory limit on the multi-table map-join
  • Add o.a.h.h.serde.Constants for backward compatibility
  • Create abstract classes for serializer and deserializer
  • Add ORC file to the grammar as a file format
  • Remove init(fname) from TestParse.vm for each test
  • Swap applying order of CP and PPD
  • Improve Error Logging in MetaStore
  • Add reflect UDF for member method invocation of column
  • ignore mapjoin hint
  • Modify PreDropPartitionEvent to pass Table parameter
  • Refactor code for finding windowing expressions
  • Expose metastore JMX metrics
  • Support avg(decimal)
  • Window handling dumps debug info on console, instead should use logger.
  • ORC runs out of heap when writing
  • Sort merge join does not work for outer joins for 7 inputs
  • sort merge join should work for outer joins for more than 8 inputs
  • optimize hive.enforce.bucketing and hive.enforce.sorting insert
  • Log logical plan tree for debugging
  • add hive.map.groupby.sorted.testmode
  • Remove unused builtins and pdk submodules
  • PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator
  • Change default bigtable selection policy for sort-merge joins
  • New Feature:
  • Implement TRUNCATE
  • lots of reserved keywords in hive
  • Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
  • Infer bucketing/sorting properties
  • Adding the oracle nvl function to the UDF
  • Specify location of log4j configuration files via configuration properties
  • Add DECIMAL data type
  • Implement HiveServer2
  • Hive List Bucketing - DML support
  • HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys
  • Add 'IGNORE PROTECTION' predicate for dropping partitions
  • when outputting a hive table to a file, users should be able to choose their own separator
  • Add Operator level Hooks
  • Support ALTER VIEW AS SELECT in Hive
  • Add a way to get the uncompressed/compressed sizes of columns from an RC File
  • getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers
  • Allow updating bucketing/sorting metadata of a partition through the CLI
  • Hive Profiler
  • Allow Decimal type columns in Regex Serde
  • Ability to create and drop temporary partition function
  • Allow partition by/order by in partitioning spec in over clause and partition function
  • Implement decimal encoding for ORC
  • Testing with Hadoop 2.x causes test failure for ORC's TestFileDump
  • Expose ORC's FileDump as a service
  • Implement a memory manager for ORC
  • Task:
  • Unescape partition names returned by show partitions
  • Add check to determine whether partition can be dropped at Semantic Analysis time
  • ALTER TABLE ADD PARTS should check for valid partition spec and throw a SemanticException if part spec is not valid
  • Add input table name to MetaStoreEndFunctionContext for logging purposes
  • Track columns accessed in each table in a query
  • Split up tests in ptf_general_queries.q
  • Merge PTFDesc and PTFDef classes
  • Add apache headers in new files
  • Create hcatalog stub directory and add it to the build
  • Test:
  • add a way to run a small unit quickly
  • Remove redundant test codes
  • Accept qfile argument for miniMR tests
  • TestMetaStoreAuthorization always uses the same port
  • Add more tests for windowing
  • add tests for distincts for hive.map.groupby.sorted
  • Update list bucketing test results
  • Wish:
  • Result of mapjoin_test_outer.q is not deterministic

New in Apache Hive 0.10.0 (May 21, 2013)

  • Sub-task:
  • Optimizer statistics on columns in tables and partitions
  • Support external hive tables whose data are stored in Azure blob store/Azure Storage Volumes (ASV)
  • Remove the duplicate JAR entries from the (“test.classpath”) to avoid command line exceeding char limit on windows
  • Windows: Fix the unit tests which contain “!” commands (Unix shell commands)
  • FileUtils.tar does not close input files
  • Fix “TestDosToUnix” unit tests on Windows by closing the leaking file handle in DosToUnix.java.
  • Fix the “TestHiveHistory”, “TestHiveConf”, & “TestExecDriver” unit tests on Windows by fixing the path related issues.
  • Handle “CRLF” line endings to avoid the extra spacing in generated test outputs in Windows. (Utilities.Java :: readColumn)
  • Remove the Unix specific absolute path of “Cat” utility in several .q files to make them run on Windows with CygWin in path.
  • PartitionPruner should log why it is not pushing the filter down to JDO
  • Bug:
  • cluster by multiple columns does not work if parenthesis is present
  • Nested UDAFs cause Hive Internal Error (NullPointerException)
  • DESCRIBE TABLE syntax doesn't support specifying a database qualified table name
  • mapjoin sometimes gives wrong results if there is a filter in the on condition
  • java.io.IOException: error=7, Argument list too long
  • Group by operator does not estimate size of Timestamp & Binary data correctly
  • LATERAL VIEW with EXPLODE produces ConcurrentModificationException
  • DROP DATABASE CASCADE does not drop non-native tables.
  • Nullpointer on registering udfs.
  • Hive Ivy dependencies on Hadoop should depend on jars directly, not tarballs
  • Make the header of RCFile unique
  • Upgrade Thrift dependency to 0.9.0
  • ability to select a view qualified by the database / schema name
  • Reduce Sink deduplication fails if the child reduce sink is followed by a join
  • Hive UDFs cannot emit binary constants
  • hive can't find hadoop executor scripts without HADOOP_HOME set
  • When integrating into MapReduce2, Hive is unable to handle corrupt rcfile archive
  • query_properties.q contains non-deterministic queries
  • NPE in "create index" without comment clause in external metastore
  • utc_from_timestamp and utc_to_timestamp returns incorrect results.
  • Task log retrieval fails on Hadoop 0.23
  • TestNegativeCliDriver autolocal1.q fails on 0.23
  • Renaming external partition changes location
  • ant gen-test failed
  • Hive error when dropping a table with large number of partitions
  • Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
  • race condition in DAG execute tasks for hive
  • analyze command throws NPE when table doesn't exist
  • Hive should expand nested structs when setting the table schema from thrift structs
  • substr on string containing UTF-8 characters produces StringIndexOutOfBoundsException
  • Queries consisting of metadata-only queries always return empty values
  • Hive JDBC doesn't support TIMESTAMP column
  • metastore delegation token is not getting used by hive commandline
  • GET_JSON_OBJECT fails on some valid JSON keys
  • Filter parsing does not recognize '!=' as operator and silently ignores invalid tokens
  • Fix maven-build Ant target
  • Fix test failure in TestNegativeCliDriver.dyn_part_max caused by HIVE-2918
  • Remove hadoop-source Ivy resolvers and Ant targets
  • Offline build is not working
  • Potential infinite loop / log spew in ZookeeperHiveLockManager
  • Memory leak in TUGIContainingTransport
  • TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly
  • Fix metastore test failures caused by HIVE-2757
  • Add JUnit to list of test dependencies managed by Ivy
  • Tests failing for me
  • Fix javadoc again
  • Update ShimLoader to work with Hadoop 2.x
  • escape more chars for script operator
  • hive docs target does not work
  • Modify clean target to remove ~/.ivy2/local/org.apache.hive ~/.ivy2/cache/org.apache.hive
  • Partition column values are not valid if any of virtual columns is selected
  • setup classpath for templates correctly for eclipse
  • TestHadoop20SAuthBridge always uses the same port
  • metastore.HiveMetaStore$HMSHandler should set the thread local raw store to null in shutdown()
  • hive.transform.escape.input breaks tab delimited data
  • revert HIVE-2703
  • Insert into table overwrites existing table if table name contains uppercase character
  • drop partition for non-string columns is failing
  • Drop partition problem
  • Filter on outer join condition removed while merging join tree
  • drop partition does not work for non-partition columns
  • Revert HIVE-2989
  • ROFL Moment. Numberator and denaminator typos
  • Oracle Metastore schema script doesn't include DDL for DN internal tables
  • make parallel tests work
  • Timestamp type values not having nano-second part breaks row
  • Hive tests should load Hive classes from build directory, not Ivy cache
  • Memory leak from large number of FileSystem instances in FileSystem.CACHE
  • Add HiveCLI that runs over JDBC
  • dropTable always executes hook.rollbackDropTable whether the drop table succeeds or fails.
  • clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
  • make copyLocal work for parallel tests
  • Hadoop20Shim.CombineFileRecordReader does not report progress within files
  • Error in Removing ProtectMode from a Table
  • sort_array doesn't work with LazyPrimitive
  • Generate & build the velocity based Hive tests on windows by fixing the path issues
  • Pass hconf values as XML instead of command line arguments to child JVM
  • use commons-compress instead of forking tar process
  • Drop table/index/database can result in orphaned locations
  • add an option in ptest to run on a single machine
  • Comment indenting is broken for "describe" in CLI
  • Bug in parallel test for singlehost flag
  • Dynamically generated partitions deleted by Block level merge
  • drop the temporary function at end of autogen_colalias.q
  • Fix non-deterministic testcases failures when running Hive0.9.0 on MapReduce2
  • Hive thrift code doesn't generate quality hashCode()
  • LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of underlying BytesWritable
  • Bucketed sort merge join doesn't work when multiple files exist for small alias
  • retry not honored in RetryingRawMetastore
  • Fix Eclipse classpath template broken in HIVE-3128
  • Drop partition throws NPE if table doesn't exist
  • Bucketed mapjoin on partitioned table which has no partition throws NPE
  • FileUtils.tar assumes wrong directory in some cases
  • JobDebugger should use RunningJob.getTrackingURL
  • Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly
  • HiveConf.getPositionFromInternalName does not support more than single-digit column numbers
  • NPE on a join query with authorization enabled
  • ColumnPruner is not working on LateralView
  • Make logging of plan progress in HadoopJobExecHelper configurable
  • Resource Leak: Fix the File handle leak in EximUtil.java
  • Fix non-deterministic results in newline.q and timestamp_lazy.q
  • Fix cascade_dbdrop.q when building hive on hadoop0.23
  • ignore white space between entries of hive/hbase table mapping
  • java primitive type for binary datatype should be byte[]
  • Sorted by order of table not respected
  • lack of semi-colon in .q file leads to missing the next statement
  • Upgrade guava to 11.0.2
  • Hive doesn't remove scratch directories while killing running MR job
  • Fix avro_joins.q testcase failure when building hive on hadoop0.23
  • alter the number of buckets for a non-empty partitioned table should not be allowed
  • bucketed mapjoin silently ignores mapjoin hint
  • HiveHistory.printRowCount() throws NPE
  • escaped columns in cluster/distribute/order/sort by are not working
  • expressions in cluster by are not working
  • Add avro jars into hive execution classpath
  • Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
  • optimize union sub-queries
  • Table schema not being copied to Partitions with no columns
  • Convert runtime exceptions to semantic exceptions for missing partitions/tables in show/describe statements
  • bucket information should be used from the partition instead of the table
  • sort merge join may silently fail to work
  • fix fs resolvers
  • Load file into a table does not update table statistics
  • HIVE-3128 introduced bug causing dynamic partitioning to fail
  • Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23
  • Race condition in query plan for merging at the end of a query
  • Fix error code inconsistency bug in mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q when running hive on hadoop23
  • SMBJoin/BucketMapJoin should be allowed only when join key expression exactly matches the sort/cluster key
  • [Regression] TestMTQueries test is failing on trunk
  • Convert runtime exceptions to semantic exceptions for validation of alter table commands
  • Archives broken for hadoop 1.0
  • Change the rules in SemanticAnalyzer to use Operator.getName() instead of hardcoded names
  • shims unit test failures block further test progress
  • Making hive tests run against different MR versions
  • Hive: Query misaligned result for Group by followed by Join with filter and skip a group-by result
  • Add junit exclude utility to disable testcases
  • Upgrade Hive's Avro dependency to version 1.7
  • bucketed map join should check that the number of files match the number of buckets
  • stats are not being collected correctly for analyze table with dynamic partitions
  • fpair on creating external table
  • Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key
  • GenMRSkewJoinProcessor uses File.Separator instead of Path.Separator
  • map-reduce jobs do not work for a partition containing sub-directories
  • Missing column causes null pointer exception
  • Parallel test script doesn't run all tests
  • Dynamic partition queries producing no partitions fail with hive.stats.reliable=true
  • hive unit tests fail to get lock using zookeeper on windows
  • insert into statement overwrites if target table is prefixed with database name
  • Duplicate data possible with speculative execution for dynamic partitions
  • Remove the specialized logic to handle the file schemas in windows vs unix from build.xml
  • Bug fix: Return the child JVM exit code to the parent process to handle the error conditions
  • Fix the file handle leaks in Symbolic & Symlink related input formats.
  • Hiveserver is not closing the existing driver handle before executing the next command, which results in file handle leaks.
  • joins using partitioned table give incorrect results on windows
  • RetryingRawStore logic needs to be significantly reworked to support retries within transactions
  • Hive List Bucketing - Skewed DDL doesn't support skewed value with string quote
  • CTAS in database with location on non-default name node fails
  • Some of the Metastore unit tests failing on Windows because of the static variables initialization problem in HiveConf class.
  • aggName of SemanticAnalyzer.getGenericUDAFEvaluator is generated in two different ways
  • Some of the JDBC test cases are failing on Windows because of the longer class path.
  • For UDAFs, when generating a plan without map-side-aggregation, constant agg parameters will be replaced by ExprNodeColumnDesc
  • Query plan for multi-join where the third table joined is a subquery containing a map-only union with hive.auto.convert.join=true is wrong
  • Avoid NPE in skewed information read
  • hivetest.py fails with --revision option
  • log4j template has logging threshold that hides all audit logs
  • Some of the tests are not deterministic
  • metadata_export_drop.q causes failure of other tests
  • QTestUtil side-effects
  • partition to directory comparison in CombineHiveInputFormat needs to accept partitions dir without scheme
  • ivysettings.xml does not let you override .m2/repository
  • Make separator for Entity name configurable
  • Hive info logging is broken
  • Avro Maps with Nullable Values fail with NPE
  • Incorrect partition bucket/sort metadata when overwriting partition with different metadata from table
  • ZooKeeperHiveLockManager does not respect the option to keep locks alive even after the current session has closed
  • derby metastore upgrade script throws errors when updating from 0.7 to 0.8
  • Output of sort merge join is no longer bucketed
  • union involving double column with a map join subquery will fail or give wrong results
  • Test "Path -> Alias" for explain extended
  • Hive always prints a warning message when using remote metastore
  • Drop database cascade fails when there are indexes on any tables
  • get_json_object and json_tuple return null in the presence of new line characters
  • Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data
  • name of some metastore scripts are not per convention
  • Use varbinary instead of longvarbinary to store min and max column values in column stats schema
  • Metastore: Sporadic unit test failures
  • Create index fails on CLI using remote metastore
  • Hive Driver leaks ZooKeeper connections
  • Metastore tests use hardcoded ports
  • Error in groupSetExpression rule in Hive grammar
  • Multiple aggregates in query fail the job
  • PTest doesn't work due to hive snapshot version upgrade to 11
  • hive unit test case build failure.
  • The derby metastore schema script for 0.10.0 doesn't run
  • Must publish new Hive-0.10 artifacts to apache repository.
  • RetryingMetaStoreClient Should Log the Caught Exception
  • hive pom file has missing conf and scope mapping for compile configuration.
  • Oracle upgrade script for Hive is broken
  • Cannot drop partitions on table when using Oracle metastore
  • Hive JIRA still shows 0.10 as unreleased in "Affects Version/s" dropdown
  • HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
  • TestCase TestMTQueries fails with Non-Sun Java
  • Doc update for .8, .9 and .10
  • closeAllForUGI causes failure in hiveserver2 when fetching large amount of data
  • Improvement:
  • Ability to enforce correct stats
  • Add a configuration property that sets the variable substitution max depth
  • metastore 0.8 upgrade script for PostgreSQL
  • Collapse hive.metastore.uris and hive.metastore.local
  • Support auto completion for hive configs in CliDriver
  • Add validation to HiveConf ConfVars
  • Improve the HWI interface
  • Move global .hiverc file
  • Support non-MR fetching for simple queries with select/limit/filter operations only
  • [hive] Provide error message when using UDAF in the place of UDF instead of throwing NPE
  • pass a environment context to metastore thrift APIs
  • hive custom scripts do not work well if the data contains new lines
  • Make the new header for RC Files introduced in HIVE-2711 optional
  • Collect_set Aggregate does unnecessary check for value.
  • JDBC cannot find metadata for tables/columns containing uppercase character
  • Improve HiveMetaStore logging
  • add findbugs in build.xml
  • Add option to make multi inserts more atomic
  • Release codecs and output streams between flushes of RCFile
  • Typo in dynamic partitioning code bits, says "genereated" instead of "generated" in some places.
  • Add hive command for resetting hive confs
  • Support Bucketed mapjoin on partitioned table which has two or more partitions
  • BucketizedHiveInputFormat should be automatically used with SMBJoin
  • getting the reporter in the recordwriter
  • Enable Metastore audit logging for non-secure connections
  • Propagates filters which are on the join condition transitively
  • enum to string conversions
  • Create Table Like should copy configured Table Parameters
  • As a follow up for HIVE-3276, optimize union for dynamic partition queries
  • Keep the original query in HiveDriverRunHookContextImpl
  • get_json_object and json_tuple should use Jackson library
  • .23 compatibility: shim job.tracker.address
  • Add Retries to Hive MetaStore Connections
  • Yet better error message in CLI on invalid column name
  • All operators' conf should inherit from a common class
  • Support partial partition specifications when enabling/disabling protections in Hive
  • perform a map-only group by if grouping key matches the sorting properties of the table
  • Provide backward compatibility for AvroSerDe properties
  • Hive maven-publish ant task should be configurable
  • To add instrumentation to capture if there is skew in reducers
  • Log client IP address with command in metastore's startFunction method
  • Allow Partition Offline Enable/Disable command to be specified at the ds level even when Partition is based on more columns than ds
  • Refactor Partition Pruner so that logic can be reused.
  • Storing certain Exception objects thrown in HiveMetaStore.java in MetaStoreEndFunctionContext
  • Early skipping for limit operator at reduce stage
  • Access to external URLs in hivetest.py
  • Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
  • Revert HIVE-3268
  • TCP KeepAlive and connection timeout for the HiveServer
  • Make prompt in Hive CLI configurable
  • Reset operator-id before executing parse tests
  • RetryingHMSHandler should wrap JDOException inside MetaException
  • Catch the NPE when using ^D to exit from CLI
  • getBoolVar in FileSinkOperator can be optimized
  • Round map/reduce progress down when it is in the range [99.5, 100)
  • New Feature:
  • Allow SELECT without a mapreduce job
  • Add SerDe for Avro serialized data
  • Implement "show create table"
  • Support with rollup option for group by
  • replace or translate function in hive
  • Implement SHOW TBLPROPERTIES
  • Support standard cross join syntax
  • Add FORMAT UDF
  • Optionally use framed transport with metastore
  • SHOW COLUMNS table_name; to provide a comma-delimited list of columns.
  • Support for Oracle-backed Hive-Metastore ("longvarchar" to "clob" in package.jdo)
  • Returning Meaningful Error Codes & Messages
  • Create a new metastore tool to bulk update location field in Db/Table/Partition records
  • Add the option -database DATABASE in hive cli to specify a default database to use for the cli session.
  • Add ability to export table metadata as JSON on table drop
  • Hive List Bucketing - DDL support
  • Skewed Join Optimization
  • Disallow certain character patterns in partition names
  • A table generating, table generating function
  • sort merge join should work if both the tables are sorted in descending order
  • Implement CUBE and ROLLUP operators in Hive
  • Implement grouping sets in hive
  • Hive List Bucketing - Query logic
  • Add a command "Explain dependency ..."
  • Hive List Bucketing - set hive.mapred.supports.subdirectories
  • Hive List Bucketing - enhance DDL to specify list bucketing table
  • Adding authorization capability to the metastore
  • Add support for phonetic algorithms in Hive
  • Task:
  • Move RegexSerDe out of hive-contrib and over to hive-serde
  • RCFileMergeMapper Prints To Standard Output Even In Silent Mode
  • Implement INCLUDE_HADOOP_MAJOR_VERSION test macro
  • Revert HIVE-2986
  • Add hive.exec.rcfile.use.explicit.header to hive-default.xml.template
  • hive.binary.record.max.length is a magic string
  • Extract global limit configuration to optimizer
  • Improve Performance of UDF PERCENTILE_APPROX()
  • Track table and keys used in joins and group bys for logging
  • Unescape partition names returned by show partitions
  • Update website with info on how to report security bugs
  • Test:
  • TestHiveServerSessions hangs when executed directly
  • TestRemoteHiveMetaStoreIpAddress always uses the same port
  • Stop testing concat of partitions containing control characters.
  • Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk
  • Add tests for 'm' big tables sortmerge join with 'n' small tables where both m,n>1
  • add tests to use bucketing metadata for partitions
  • Add more tests where output of sort merge join is sorted
  • New test cases added by HIVE-3676 in insert1.q are not deterministic
  • Wish:
  • Log Time To Submit metric with PerfLogger
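
The 0.10.0 improvement "Round map/reduce progress down when it is in the range [99.5, 100)" describes a display rule: a job should not appear to be at 100% until it has really finished. The sketch below illustrates that rule only. It is not Hive's code, and the function name `displayed_progress` and its fraction-based input are assumptions made for this example.

```python
def displayed_progress(fraction: float) -> int:
    """Return a whole-number percentage suitable for display.

    Hypothetical helper: `fraction` is job progress in [0.0, 1.0].
    Values normally round to the nearest integer, but anything in
    [99.5, 100) is rounded down so that 100 is shown only on completion.
    """
    percent = fraction * 100.0
    if 99.5 <= percent < 100.0:
        return 99  # round down instead of reporting a premature 100%
    return int(round(percent))


if __name__ == "__main__":
    for f in (0.25, 0.996, 1.0):
        print(f, "->", displayed_progress(f))
```

Without the special case, a job at 99.6% would be reported as 100% while tasks are still running.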

New in Apache Hive 0.9.0 (May 21, 2013)

  • Sub-task:
  • add DOAP file for Hive
  • Enable/Add type-specific compression for rcfile
  • Move retry logic in HiveMetaStore to a separate class
  • Add support for filter pushdown for key ranges in hbase for keys of type string
  • Bug:
  • Hive Server getSchema() returns wrong schema for "Explain" queries
  • "hdfs" is hardcoded in few places in the code which inhibits use of other file systems
  • show functions also returns internal operators
  • Not using map aggregation, fails to execute group-by after cluster-by with same key
  • HiveServer should provide per session configuration
  • Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory
  • left semi join will duplicate data
  • Compact index table's files merged in creation
  • Passing user identity from metastore client to server in non-secure mode
  • Insert overwrite table db.tname fails if partition already exists
  • Describe partition returns table columns but should return partition columns
  • Make a single Hive binary work with both 0.20.x and 0.23.0
  • Make Hive work with Hadoop 1.0.0
  • ignore exception for external jars via reflection
  • wrong class loader used for external jars
  • Force Bash shell on parallel test slave nodes
  • Parallel tests fail if master directory is not present
  • Allow multiple ptest runs by the same person
  • Parallel test commands that include cd fail
  • "hive.querylog.location" requires parent directory to exist or else folder creation fails
  • builtins JAR is not being published to Maven repo & hive-cli POM does not depend on it either
  • Need better exception handling in RCFile tolerate corruptions mode
  • StackOverflowError when using custom UDF in map join
  • Eclipse launch configurations fail due to unsatisfied builtins JAR dependency
  • get_partitions_ps throws TApplicationException if table doesn't exist
  • SUCESS is misspelled
  • a bug in 'alter table concatenate' that causes filenames to be double URL-encoded
  • SemanticAnalyzer twice swallows an exception it shouldn't
  • StackOverflowError when using custom UDF after adding archive after adding jars
  • Lots of special characters are not handled in LIKE
  • NPE in union followed by join
  • Remove unused lib/log4j-1.2.15.jar
  • Fix flaky testing infrastructure
  • Fix some nondeterministic test output
  • PlanUtils.configureTableJobPropertiesForStorageHandler() is not called for partitioned table
  • Single binary built against 0.20 and 0.23, does not work against 0.23 clusters.
  • Metastore client doesn't log properly in case of connection failure to server
  • CONV returns incorrect results sometimes
  • Hive multi group by single reducer optimization causes invalid column reference error
  • Remove empty java files
  • NPE in union with lateral view
  • union followed by union_subq does not work if the subquery union has reducers
  • Metastore is caching too aggressively
  • Change global_limit.q into linux format file
  • Remove lib/javaewah-0.3.jar
  • Alter Table Partition Concatenate Fails On Certain Characters
  • union with a multi-table insert is not working
  • make union31.q deterministic
  • Fail on table sampling
  • New BINARY type produces unexpected results with supported UDFS when using MapReduce2
  • filter is still removed due to regression of HIVE-1538 although HIVE-2344
  • SUBSTR(CAST( AS BINARY)) produces unexpected results
  • Disable loadpart_err.q on 0.23
  • Export LANG=en_US.UTF-8 to environment while running tests
  • typo in configuration parameter
  • TestContribCliDriver.dboutput and TestCliDriver.input45 fail on 0.23
  • Fix test failures caused by HIVE-2716
  • insert into external tables should not be allowed
  • cleanup readentity/writeentity
  • INPUT__FILE__NAME virtual column returns unqualified paths on Hadoop 0.23
  • Fix TestCliDriver escape1.q failure on MR2
  • QTestUtil.cleanUp() fails with FileNotException on 0.23
  • Ambiguous table name or column reference message displays when table and column names are the same
  • Renaming partition changes partition location prefix
  • Metastore client doesn't close connection properly
  • Hive union with NULL constant and string in same column returns all null
  • BlockMergeTask Doesn't Honor Job Configuration Properties when used directly
  • TestStatsPublisherEnhanced throws NPE on JDBC connection failure
  • testAclPositive in TestZooKeeperTokenStore failing in clean checkout when run on Mac
  • HiveFileFormatUtils should use Path.SEPARATOR instead of File.Separator
  • GROUP BY causing ClassCastException [LazyDioInteger cannot be cast LazyInteger]
  • several jars in hive tar generated are not required
  • JOIN + LATERAL VIEW + MAPJOIN fails to return result (seems to stop halfway through and no longer do the final reduce part)
  • Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data
  • TestCliDriver (script_pipe.q) failed with IBM JDK
  • Doc update for .8, .9 and .10
  • Improvement:
  • use sed rather than diff for masking out noise in diff-based tests
  • parallelize test query runs
  • Add java_method() as a synonym for the reflect() UDF
  • Extend concat_ws() UDF to support arrays of strings
  • When creating constant expression for numbers, try to infer type from another comparison operand, instead of trying to use integer first, and then long and double
  • Add timestamp column to the partition stats table.
  • pull junit jar from maven repos via ivy
  • Add target to install Hive JARs/POMs in the local Maven cache
  • Expose the HiveConf in HiveConnection API
  • Newly created partition should inherit properties from table
  • Make index table output of create index command if index is table based
  • move one line log from MapOperator to HiveContextAwareRecordReader
  • Add alterPartition to AlterHandler interface
  • fix Hive-2566 and make union optimization more aggressive
  • The variable hive.exec.mode.local.auto.tasks.max should be changed
  • Change arc config to hide generated files from Differential by default
  • Add Ant configuration property for dumping classpath of tests
  • Support for metastore service specific HADOOP_OPTS environment setting
  • The row count loaded to a table may not be right
  • Add 'ivy-clean-cache' and 'very-clean' Ant targets
  • Make ZooKeeper token store ACL configurable
  • Views should be added to the inputs of queries.
  • TestCliDriver should log elapsed time
  • Obtain delegation tokens for MR jobs in secure hbase setup
  • hbase handler uses ZooKeeperConnectionException which is not compatible with HBase versions other than 0.89
  • HiveStorageHandler.configureTableJobProperties() should let the handler know whether it is configuration for input or output
  • Improve hooks run in Driver
  • HBaseSerDe should allow users to specify the timestamp passed to Puts
  • View partitions do not have a storage descriptor
  • Make the IP address of a Thrift client available to HMSHandler.
  • Add logging of total run time of Driver
  • Concatenating a partition does not inherit location from table
  • Implement nullsafe equi-join
  • Cache error messages for additional logging
  • Change default configuration for hive.exec.dynamic.partition
  • Fix javadoc warnings
  • Remove zero length files
  • Add pre event listeners to metastore
  • Cache remote map reduce job stack traces for additional logging
  • Support eventual constant expression for filter pushdown for key ranges in hbase
  • If hive history file's directory doesn't exist don't crash
  • hive-config.sh should honor HIVE_HOME env
  • Cache local map reduce job errors for additional logging
  • Add a new hook to run at the beginning and end of the Driver.run method
  • Store which configs the user has explicitly changed
  • Add "rat" target to build to look for missing license headers
  • Remove redundant key comparing in SMBMapJoinOperator
  • TextConverter for UDF's is inefficient if the input object is already Text or Lazy
  • Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators
  • Add license to the Hive files
  • Hive metastore does not have any log messages while shutting itself down.
  • Remove need for storage descriptors for view partitions
  • Add support for filter pushdown for composite keys
  • New Feature:
  • Allow access to Primitive types stored in binary format in HBase
  • Implement BETWEEN operator
  • Implement sort_array UDF
  • Add reset operation and average time attribute to Metrics MBean.
  • add support for insert partition overwrite(...) if not exists
  • support hive tables/partitions that exist in more than one region
  • Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
  • Add PRINTF() Udf
  • Enable Hadoop-1.0.0 in Hive
  • Implement NULL-safe equality operator
  • Filter pushdown in hbase for keys stored in binary format
  • Closed range scans on hbase keys
  • Add JSON output to the hive ddl commands
  • RCFile Reader doesn't provide access to Metadata
  • Add nicer helper functions for adding and reading metadata from RCFiles
  • Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory
  • Task:
  • Hive Web Server startup messages logs incorrect path it is searching for WAR
  • Fix test failures caused by HIVE-2589
  • Upgrade HBase and ZK dependencies
  • Add a getAuthorizationProvider to HiveStorageHandler
  • Move metastore upgrade scripts labeled 0.10.0 into scripts labeled 0.9.0
  • Remove unnecessary JAR dependencies
  • Revert HIVE-2612
  • Revert HIVE-2795
  • Row number issue in hive
  • Test:
  • Test ppr_pushdown.q is failing on trunk
  • add a testcase for partitioned view on union and base tables have index
  • Wish:
  • Clean-up logs
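
The 0.9.0 entries "Implement NULL-safe equality operator" and "Implement nullsafe equi-join" refer to a comparison that treats two NULLs as equal instead of yielding NULL. The sketch below models that semantics in Python, using `None` for SQL NULL. It is an illustration of the idea rather than Hive's implementation, and the function names are hypothetical.

```python
from typing import Any, Optional


def sql_equals(a: Any, b: Any) -> Optional[bool]:
    """Ordinary SQL equality: the result is NULL (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_equals(a: Any, b: Any) -> bool:
    """NULL-safe equality: never returns NULL.

    Two NULLs compare equal; a NULL and a non-NULL value compare unequal.
    """
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b


def null_safe_join(left, right):
    """Naive nested-loop equi-join on the first field using NULL-safe equality."""
    return [(l, r) for l in left for r in right if null_safe_equals(l[0], r[0])]


if __name__ == "__main__":
    left = [(1, "a"), (None, "b")]
    right = [(1, "x"), (None, "y")]
    print(null_safe_join(left, right))
```

With ordinary equality the row keyed by NULL would never match. With the NULL-safe form, the two NULL-keyed rows join to each other.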

New in Apache Hive 0.8.0 (May 21, 2013)

  • New Feature:
  • Add TIMESTAMP column type for thrift dynamic_type
  • Support "INSERT [INTO] destination"
  • Triggers when a new partition is created for a table
  • Create a Hive CLI that connects to hive ThriftServer
  • Allow type widening on COALESCE/UNION ALL
  • Add support of columnar binary serde
  • optimize metadata only queries
  • Partitioning columns should be of primitive types only
  • add an interface in RCFile to support concatenation of two files without (de)compression
  • Allow users to specify LOCATION in CREATE DATABASE statement
  • Accelerate GROUP BY execution using indexes
  • Implement map_keys() and map_values() UDFs
  • Extend Explode UDTF to handle Maps
  • Implement bitmap indexing in Hive
  • Add export/import facilities to the hive system
  • support explicit view partitioning
  • Block merge for RCFile
  • Add "DROP DATABASE ... CASCADE/RESTRICT"
  • Input Sampling By Splits
  • extend table statistics to store the size of uncompressed data (+extend interfaces for collecting other types of statistics)
  • Add get_table_objects_by_name() to Hive MetaStore
  • Add api for marking / querying set of partitions for events
  • support grouping on complex types in Hive
  • Purge expired events
  • Cli: Print Hadoop's CPU milliseconds
  • Add a Plugin Developer Kit to Hive
  • add TIMESTAMP data type
  • Support archiving for multiple partitions if the table is partitioned by multiple columns
  • Add Binary Datatype in Hive
  • Allow Hive to be debugged remotely
  • Literal bigint
  • Allow UDFs to specify additional FILE/JAR resources necessary for execution
  • Bug:
  • better error code from Hive describe command
  • Join operation fails for some queries
  • Improve the error messages for missing/incorrect UDF/UDAF class
  • CREATE TABLE t LIKE some_view should create a new empty base table, but instead creates a copy of view
  • describe parse_url throws an error
  • Predicate push down get error result when sub-queries have the same alias name
  • Clean up references to 'hive.metastore.local'
  • FilterOperator is applied twice with ppd on.
  • ProxyFileSystem.close calls super.close twice.
  • job name for alter table archive partition is not correct
  • JDBC driver returns wrong precision, scale, or column size for some data types
  • SAXParseException on plan.xml during local mode.
  • Different defaults for hive.metastore.local
  • alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?)
  • Potential risk of resource leaks in Hive
  • DDLSemanticAnalyzer won't take newly set Hive parameters
  • Metastore operations (like drop_partition) could be improved in terms of maintaining consistency of metadata and data
  • Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.
  • Don't set ivy.home in build-common.xml
  • Auto convert mapjoin should not throw exception if the top operator is union operator.
  • Getting error when join on tables where name of table has uppercase letters
  • In error scenario some opened streams may not be closed in ScriptOperator.java, Utilities.java
  • "insert overwrite directory" Not able to insert data with multi level directory path
  • Exception should be thrown when invalid jar,file,archive is given to add command
  • Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts
  • HWI admin_list_jobs JSP page throws exception
  • Make the delegation token issued by the MetaStore owned by the right user
  • Add inputs and outputs to authorization DDL commands
  • LOAD compilation does not set the outputs during semantic analysis resulting in no authorization checks being done for it.
  • keyword_1.q is failing
  • Making JDO thread-safe by default
  • In Driver.execute(), mapred.job.tracker is not restored if one of the tasks fails.
  • Fix TestEmbeddedHiveMetaStore and TestRemoteHiveMetaStore broken by HIVE-2022
  • Correct the exception message for better traceability when loading into a partitioned table having 2 partitions while specifying only one partition in the load statement.
  • create database does not honour warehouse.dir in dbproperties
  • A database's warehouse.dir is not used for tables created in it.
  • Backport HIVE-1991 after overridden by HIVE-1950
  • Merge result file size should honor hive.merge.size.per.task
  • the retry logic in Hive's concurrency is not working correctly.
  • In error scenario some opened streams may not be closed
  • TCTLSeparatedProtocol.SimpleTransportTokenizer.nextToken() throws Null Pointer Exception in some cases
  • Exception on windows when using the jdbc driver. "IOException: The system cannot find the path specified"
  • CLI local mode hit NPE when exiting by ^D
  • Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
  • HivePreparedStatement.executeImmediate always throws an exception
  • NullPointerException on getSchemas
  • Few code improvements in the ql and serde packages.
  • Bug: RowContainer was set to 1 in JoinUtils.
  • Add test coverage for external table data loss issue
  • auto convert map join bug
  • throw an error if the input is larger than a threshold for index input format
  • Make couple of convenience methods in EximUtil public
  • virtual column references inside subqueries cause execution exceptions
  • Log4J initialization info should not be printed out if -S is specified
  • In shell mode, local mode continues if a local-mode task throws exception in pre-hooks
  • insert overwrite ignoring partition location
  • auto convert map join may miss good candidates
  • Remove usage of deprecated methods from org.apache.hadoop.io package
  • alter table concatenate fails and deletes data
  • Bitmap Operation UDF doesn't clear return list
  • Exception when no splits returned from index
  • Jobs do not get killed even when they created too many files.
  • NPE during parsing order-by expression
  • Block Sampling should adjust number of reducers accordingly to make it useful
  • Too many open files in running negative cli tests
  • Stats JDBC LIKE queries should escape '_' and '%'
  • NPE in MapJoinObjectKey
  • TableSample(percent) uses an int for one intermediate size, which overflows for large sampled sizes, so the sampling is never triggered.
  • Few code improvements in the metastore,hwi and ql packages.
  • Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus
  • Log related Check style Comments fixes
  • Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
  • Avoid null pointer exception when executing UDF
  • In Task class and its subclasses logger is initialized in constructor
  • Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
  • Dynamic Partitioning Failing because of characters not supported globStatus
  • Stats table schema incompatible after HIVE-2185
  • Ensure HiveConf includes all properties defined in hive-default.xml
  • SessionState used before ThreadLocal set
  • While using Hive in server mode, HiveConnection.close() is not cleaning up server side resources
  • incorrect success flag passed to jobClose
  • unable to get column names for a specific table that has '_' as part of its table name
  • Fix a bug caused by HIVE-243
  • CommandNeedRetryException.java is missing ASF header
  • runnable queue in Driver and DriverContext is not thread safe
  • hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
  • Can't publish maven release artifacts to apache repository
  • Comparison Operators convert number types to common type instead of double if possible
  • Merge failing of join tree in exceptional case
  • Enable TestHadoop20SAuthBridge
  • Skip comments in hive script
  • ExecDriver::addInputPaths should pass the table properties to the record writer
  • Revert HIVE-2219 and apply correct patch to improve the efficiency of dropping multiple partitions
  • Fix Inconsistency between RB and JIRA patches for HIVE-2194
  • Regression introduced from HIVE-2155
  • ClassCastException when building index with security.authorization turned on
  • Error during UNARCHIVE of a partition
  • Comment clause should immediately follow identifier field in CREATE DATABASE statement
  • Allow ShimLoader to work with Hadoop 0.20-append
  • bad compressed file names from insert into
  • Fix UDAFPercentile to tolerate null percentiles
  • files with control-A,B are not delimited correctly.
  • Schema creation scripts for PostgreSQL use bit(1) instead of boolean
  • Incorrect regular expression for extracting task id from filename
  • DatabaseMetadata.getColumns() does not return partition column names for a table
  • Calling alter_table after changing partition comment throws an exception
  • Add ColumnarSerDe to the list of native SerDes
  • Turn off bitmap indexing when map-side aggregation is turned off
  • hive.zookeeper.session.timeout is set to null in hive-default.xml
  • Turn off compression when generating index intermediate results
  • DESCRIBE TABLE causes NPE when hive.cli.print.header=true
  • Indexes are still automatically queried when out of sync with their source tables
  • Predicate pushdown erroneously conservative with outer joins
  • Alter table always throws an unhelpful error on failure
  • mirror.facebook.net is 404ing
  • stats not updated for non "load table desc" operations
  • filter is removed due to regression of HIVE-1538
  • Fix udtf_explode.q and udf_explode.q test failures
  • JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types
  • HiveConf properties not appearing in the output of 'set' or 'set -v'
  • Metastore upgrade scripts for HIVE-2246 do not migrate indexes nor rename the old COLUMNS table
  • Slow dropping of partitions caused by full listing of storage descriptors
  • Minor typo in error message in HiveConnection.java (JDBC)
  • Invalid predicate pushdown from incorrect column expression map for select operator generated by GROUP BY operation
  • Incorrect alias filtering for predicate pushdown
  • import of multiple partitions from a partitioned table with external location overwrites files
  • Add Mockito to LICENSE file
  • published POMs in Maven repo are incorrect
  • Fix whitespace test diff accidentally introduced in HIVE-1360
  • Hive server doesn't return schema for 'set' command
  • Function like with empty string is throwing null pointer exception
  • get_privilege does not get user level privilege
  • File extensions not preserved in Hive.checkPaths when renaming new destination file
  • Metastore server tries to connect to NN without authenticating itself
  • Update Eclipse configuration to include Mockito dependency
  • BlockMergeTask ignores client-specified jars
  • Merging of compressed rcfiles fails to write the valuebuffer part correctly
  • skip corruption bug that causes data not to be decompressed
  • upgrading thrift version didn't upgrade libthrift.jar symlink correctly
  • TABLESAMPLE(BUCKET xxx) sometimes doesn't trigger input pruning as regression of HIVE-1538
  • Pass correct remoteAddress in proxy user authentication
  • remove all @author tags from source
  • fix Eclipse for javaewah upgrade
  • Primitive Data Types returning null if the data is out of range of the data type.
  • mapjoin_subquery dump small table (mapjoin table) to the same file
  • Metastore statistics are not being updated for CTAS queries.
  • Hive PDK needs an Ivy configuration file
  • HadoopJobExecHelper does not handle null counters well
  • Phabricator for code review
  • Bug from HIVE-2446, the code that calls client stats publishers run() methods is in wrong place, should be in the same method but inside of while (!rj.isComplete()) {} loop
  • PDK tests failing on Hudson because HADOOP_HOME is not defined
  • PDK PluginTest failing on Hudson
  • partition pruning prunes some correct partitions under specific conditions
  • small table filesize for automapjoin is not consistent in HiveConf.java and hive-default.xml
  • When new instance of Hive (class) is created, the current database is reset to default (current database shouldn't be changed).
  • Hive throws Null Pointer Exception upon CREATE TABLE <db_name>.<table_name> .... if the given <db_name> doesn't exist
  • cleanup QTestUtil: use test.data.files as current directory if one not specified
  • Dynamic partition insert should enforce the order of the partition spec is the same as the one in schema
  • HIVE-2446 bug (next one) - If constructor of ClientStatsPublisher throws runtime exception it will be propagated to HadoopJobExecHelper's progress method and beyond, whereas it shouldn't
  • Allow people to use only issue numbers without 'HIVE-' prefix with `arc diff --jira`.
  • Evaluation of non-deterministic/stateful UDFs should not be skipped even if constant oi is returned.
  • HiveIndexResult creation fails due to file system issue
  • Support scientific notation for Double literals
  • How to submit documentation fixes
  • Provide jira_base_url for improved arc commit workflow
  • upgrade script 008-HIVE-2246.mysql.sql contains syntax errors
  • HIVE-2247 Changed the Thrift API causing compatibility issues.
  • Add Java linter to Hive
  • HIVE-2246 upgrade script needs to drop foreign key in COLUMNS_OLD
  • eclipse template .classpath is broken
  • HIVE-2246 upgrade script changed the COLUMNS_V2.COMMENT length
  • ivy offline mode broken by changingPattern and checkmodified attributes
  • Debug mode in some situations doesn't work properly when child JVM is started from MapRedLocalTask
  • Hive build fails with error "java.io.IOException: Not in GZIP format"
  • explain task: getJSONPlan throws a NPE if the ast is null
  • bug in ivy 2.2.0 breaks build
  • Update arcconfig to include commit listener
  • HBase bulk load wiki page improvements
  • Update README.txt file to use description from wiki
  • HiveCli eclipse launch configuration hangs
  • Hive POMs reference the wrong Hadoop artifacts
  • Fix eclipse classpath template broken in HIVE-2523
  • Fix maven-build Ant target
  • TestHiveServer doesn't produce a JUnit report file
  • revert HIVE-2566
  • Recent patch prevents Hadoop confs from loading in 0.20.204
  • Improvement:
  • CREATE VIEW followup: CREATE OR REPLACE
  • Allow UDFs to access constant parameter values at compile time
  • increase hive.mapjoin.maxsize to 10 million
  • use filter pushdown for automatically accessing indexes
  • HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack
  • Improve miscellaneous error messages
  • support NOT IN and NOT LIKE syntax
  • HiveInputFormat.readFields should print the cause when there's an exception
  • Ctrl+c should kill currently running query, but not exit the CLI
  • The class HiveResultSet should implement batch fetching.
  • Task-cleanup task should be disabled
  • HIVE-78 Followup: group partitions by tables when do authorizations and there is no partition level privilege
  • Change Default Alias For Aggregated Columns (_c1)
  • mapjoin operator should not load hashtable for each new inputfile if the hashtable to be loaded is already there.
  • recognize transitivity of predicates on join keys
  • Hive Shell to output number of mappers and number of reducers
  • Support new annotation @UDFType(stateful = true)
  • adding comments to Hive Stats JDBC queries
  • Expand exceptions caught for metastore operations
  • avoid loading Hive aux jars in CLI remote mode
  • Create a separate namespace for Hive variables
  • Performance instruments for client side execution
  • isEmptyPath() to use ContentSummary cache
  • Use block-level merge for RCFile if merging intermediate results are needed
  • Update bitmap indexes for automatic usage
  • Metastore listener
  • remove hadoop version check from hive cli shell script
  • getInputSummary() to call FileSystem.getContentSummary() in parallel
  • PostHook and PreHook API to add flag to indicate it is pre or post hook plus cache for content summary
  • Generate single MR job for multi groupby query if hive.multigroupby.singlemr is enabled.
  • Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
  • SHOW GRANT grantTime field should be a human-readable timestamp
  • Reduce memory consumption in preparing MapReduce job
  • Increase the number of operator counter
  • No lock for some non-mapred tasks; config variable hive.lock.mapred.only.operation added
  • Optimizer on partition field
  • Hive's symlink text input format should be able to work with CombineHiveInputFormat
  • Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait
  • Automatic Indexing with multiple tables
  • DROP TABLE IF EXISTS should not fail if a view of that name exists
  • Remove System.exit
  • Enables HiveServer to accept -hiveconf option
  • reduce workload generated by JDBCStatsPublisher
  • Add api to send / receive message to metastore
  • Add interface classification in Hive.
  • add exception handling to hive's record reader
  • Improve error messages emitted during semantic analysis
  • Improve error messages emitted during task execution
  • Allow custom serdes to set field comments
  • Allow optional [inner] on equi-join.
  • Add actions for alter table and alter partition events for metastore event listeners
  • reduce name node calls in hive by creating temporary directories
  • create a new API in Warehouse where the root directory is specified
  • Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
  • ALTER VIEW RENAME
  • Optimize partial specification metastore functions
  • add Query text for debugging in lock data
  • speedup addInputPaths
  • Make "alter table drop partition" more efficient
  • Provide metastore upgrade script for HIVE-2215
  • Ability to add partitions atomically
  • Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
  • Show current database in hive prompt
  • Make CombineHiveInputFormat the default hive.input.format
  • Dedupe tables' column schemas from partitions in the metastore db
  • Display a sample of partitions created when Fatal Error occurred due to too many partitions created
  • Better error message in CLI on invalid column name
  • Local mode needs to work well with block sampling
  • bucketized map join should allow join key as a superset of bucketized columns
  • Improve error messages for DESCRIBE command
  • Optimize Hive query startup time for multiple partitions
  • Add hooks to run when execution fails.
  • Make Hadoop Job ID available after task finishes executing
  • Improve RCFile Read Speed
  • Support automatic rebuilding of indexes when they go stale
  • Make performance logging configurable.
  • Improve RCFileCat performance significantly
  • Warn user that precision is lost when bigint is implicitly cast to double.
  • Local Mode can be more aggressive if LIMIT optimization is on
  • RCFileReader Buffer Reuse
  • Allow RCFile Reader to tolerate corruptions
  • make hive mapper initialize faster when having tons of input files
  • The PerfLogger should log the full name of hooks, not just the simple name.
  • Introduction of client statistics publishers possibility
  • Add job ID to MapRedStats
  • Upgrade JavaEWAH to 0.3
  • move lock retry logic into ZooKeeperHiveLockManager
  • Need a way to categorize queries in hooks for improved logging
  • JDBCStatsAggregator DELETE STATEMENT should escape _ and %
  • Files in Avro-backed Hive tables do not have a ".avro" extension
  • Group-by query optimization Followup: add flag in conf/hive-default.xml
  • Add method to PerfLogger to perform cleanup/final steps.
  • make INNER a non-reserved keyword
  • HA Support for Metastore Server
  • Improve support for Constant Object Inspectors
  • Log more Hadoop task counter values in the MapRedStats class.
  • Enable ALTER TABLE SET SERDE to work on partition level
  • Update junit jar in testlibs
  • Get ConstantObjectInspectors working in UDAFs
  • Make Constant OIs work with UDTFs.
  • add a new builtins subproject
  • Consecutive string literals should be combined into a single string literal.
  • Use sorted nature of compact indexes
  • Make metastore log4j configuration file configurable again.
  • add explain formatted
  • Use hashing instead of list traversal for IN operator for primitive types
  • reduce the number of map-reduce jobs for union all
  • Too much debugging info on console if a job failed
  • avoid referencing /tmp in tests
  • Setting no_drop on a table should cascade to child partitions
  • Add caching to json_tuple
  • Add hook to run in metastore's endFunction which can collect more fb303 counters
  • Task:
  • Hive in Maven
  • Provide Metastore upgrade scripts and default schemas for PostgreSQL
  • Remaining patch for HIVE-2148
  • Use the version of commons-codec from Hadoop
  • Upgrade Hive's Thrift dependency to version 0.7.0
  • Metastore upgrade scripts for schema change introduced in HIVE-2215
  • Metastore upgrade script and schema DDL for Hive 0.8.0
  • Make Hive compile against Hadoop 0.23
  • Add pdk, hbase-handler etc as source dir in eclipse
  • Update wiki links in README file
  • Omit incomplete Postgres upgrade scripts from release tarball
  • Sub-task:
  • Support JDBC ResultSetMetadata
  • Bundle Log4j configuration files in Hive JARs
  • Push down partition pruning to JDO filtering for a subset of partition predicates
  • batch processing partition pruning process
  • Backward incompatibility introduced from HIVE-2082 in MetaStoreUtils.getPartSchemaFromTableSchema()
  • Partition Pruning bug in the case of hive.mapred.mode=nonstrict
  • Return correct Major / Minor version numbers for Hive Driver
  • add the HivePreparedStatement implementation based on current HIVE supported data-type
  • add a TM to Hive logo image
  • Update project naming and description in Hive wiki
  • Update project naming and description in Hive website
  • update project website navigation links
  • add trademark attributions to Hive homepage
  • Update project description and wiki link in ivy.xml files
  • Test:
  • Test that views with joins work properly
  • TestLazySimpleSerde fails randomly
  • create a test to verify that partition pruning works for partitioned views with a union
  • Wish:
  • ^C breaks out of running query, but not whole CLI

New in Apache Hive 0.7.0 (May 21, 2013)

  • New Feature:
  • Authorization infrastructure for Hive
  • Implement Indexing in Hive
  • Add reflect() UDF for reflective invocation of Java methods
  • Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
  • Authentication Infrastructure for Hive
  • Hive Variables
  • Concurrency Model for Hive
  • add row_sequence UDF
  • hive command line option -i to run an init file before other SQL commands
  • add option to let hive automatically run in local mode based on tunable heuristics
  • bring a table/partition offline
  • sentences() UDF for natural language tokenization
  • ngrams() UDAF for estimating top-k n-gram frequencies
  • Be able to modify a partition's fileformat and file location information.
  • context_ngrams() UDAF for estimating top-k contextual n-grams
  • Add json_tuple() UDTF function
  • Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
  • Add ANSI SQL correlation aggregate function CORR(X,Y).
  • Support partition filtering in metastore
  • Patch to allow scripts in S3 location
  • Implement "SHOW TABLES {FROM | IN} db_name"
  • parse_url_tuple: a UDTF version of parse_url
  • Default values for parameters
  • Implement GenericUDF str_to_map
  • Patch to support HAVING clause in Hive
  • track the joins which are being converted to map-join automatically
  • Call frequency and duration metrics for HiveMetaStore via jmx
  • maintain lastAccessTime in the metastore
  • Make Hive database data center aware
  • Add a new local mode flag in Task.
  • Better auto-complete for Hive
  • Support ALTER DATABASE to change database properties
  • Implement DROP TABLE/VIEW ... IF EXISTS
  • Implement DROP {PARTITION, INDEX, TEMPORARY FUNCTION} IF EXISTS
  • Make the MetaStore filesystem interface pluggable via the hive.metastore.fs.handler.class configuration property
  • add an option (hive.index.compact.file.ignore.hdfs) to ignore HDFS location stored in index files.
  • Verbose/echo mode for the Hive CLI
  • Improvement:
  • Provide option to export a HEADER
  • Support for distinct selection on two or more columns
  • describe extended table/partition output is cryptic
  • Missing some JDBC functionality like getTables, getColumns, and HiveResultSet.get* methods based on column name.
  • Tapping logs from child processes
  • support filter pushdown against non-native tables
  • replace dependencies on HBase deprecated API
  • use Ivy for fetching HBase dependencies
  • Make Hive work with Hadoop security
  • Return value for map, array, and struct needs to return a string
  • do not update transient_lastDdlTime if the partition is modified by a housekeeping operation
  • automatically invoke .hiverc init script
  • add CLI command for executing a SQL script
  • serializing/deserializing the query plan is useless and expensive
  • Extend ivy offline mode to cover metastore downloads
  • Add support to turn off bucketing with ALTER TABLE
  • Speed up reflection method calls in GenericUDFBridge and GenericUDAFBridge
  • potential NullPointerException
  • hive output file names are unnecessarily large
  • replace isArray() calls and remove LOG.isInfoEnabled() in Operator.forward()
  • supply correct information to hooks and lineage for index rebuild
  • support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
  • support IDXPROPERTIES on CREATE INDEX
  • Need to get hive_hbase-handler to work with HBase versions 0.20.4, 0.20.5, and Cloudera CDH3 version
  • hive starter scripts should load admin/user supplied script for configurability
  • ability to select across a database
  • Use ZooKeeper from maven
  • Add support for JDBC PreparedStatements
  • Ability to plug custom Semantic Analyzers for Hive Grammar
  • CompactIndexInputFormat should create split only for files in the index output file.
  • regression and improvements in handling NULLs in joins
  • Add alternative search-provider to Hive site
  • Add ProtocolBuffersStructObjectInspector
  • ScriptOperator's AutoProgressor can lead to an infinite loop
  • Use CombineHiveInputFormat for the merge job if hive.merge.mapredfiles=true
  • convert commonly used udfs to generic udfs
  • add map joined table to distributed cache
  • Convert join queries to map-join based on size of table/row
  • ability to specify parent directory for zookeeper lock manager
  • Adding consistency check at jobClose() when committing dynamic partitions
  • Change get_partitions_ps to pass partition filter to database
  • FetchOperator.getInputFormatFromCache hides causal exception
  • drop support for pre-0.20 Hadoop versions
  • remove Hadoop 0.17 specific test reference logs
  • Optimize Key Comparison in GroupByOperator
  • Group-by to determine equals of Keys in reverse order
  • Support for using ALTER to set IDXPROPERTIES
  • ExecMapper and ExecReducer: reduce function calls to l4j.isInfoEnabled()
  • Remove Partition Filtering Conditions when Possible
  • Optimize ColumnarStructObjectInspector.getStructFieldData()
  • Remove JDBM component from Map Join
  • test cleanup for Hive-1641
  • optimize group by hash map memory
  • Support show locks for a particular table
  • Add queryid while locking
  • Update transient_lastDdlTime only if not specified
  • add more debug information for hive locking
  • CommonJoinOperator optimize the case of 1:1 join
  • change Pre/Post Query Hooks to take in 1 parameter: HookContext
  • Improve documentation for str_to_map() UDF
  • optimize the code path when there are no outer joins
  • dumps time at which lock was taken along with the queryid in show locks extended
  • Compress the hashtable dump file before putting it into the distributed cache
  • Clear empty files in Hive
  • HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
  • Show the time the local task takes
  • create a new ZooKeeper instance when retrying lock, and more info for debug
  • Add an option to run task to check map-join possibility in non-local mode
  • more debugging for locking
  • add an option in dynamic partition inserts to throw an error if 0 partitions are created
  • Reduce unnecessary DFSClient.rename() calls
  • Include Process ID in the log4j log file name
  • redo zookeeper hive lock manager
  • add a factory method for creating a synchronized wrapper for IMetaStoreClient
  • a mapper should be able to span multiple partitions
  • Store jobid in ExecDriver
  • Provide config parameters to control cache object pinning
  • Allow any type of stats publisher and aggregator in addition to HBase and JDBC
  • Find a way to disable owner grants
  • Improve the implementation of the METASTORE_CACHE_PINOBJTYPES config
  • Have audit logging in the Metastore
  • Provide DFS initialization script for Hive
  • Make Stats gathering more flexible with timeout and atomicity
  • make a libthrift.jar and libfb303.jar in dist package for backward compatibility
  • Modify build to run all tests regardless of subproject failures
  • Hive SymlinkTextInputFormat does not estimate input size correctly
  • Bug:
  • "LOAD DATA LOCAL INPATH" fails when the table already contains a file of the same name
  • NULL is not handled correctly in join
  • HiveInputFormat.getInputFormatFromCache "swallows" cause exception when throwing IOException
  • add progress in join and groupby
  • Simple UDAFs with more than 1 parameter crash on empty row query
  • UDF field() doesn't work
  • Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
  • skip counter update when RunningJob.getCounters() returns null
  • FetchOperator(mapjoin) does not work with RCFile
  • bug in 'set fileformat'
  • Make Eclipse launch templates auto-adjust to Hive version number changes
  • Reporting progress in FileSinkOperator works in multiple directory case
  • hive-site.xml ${user.name} not replaced for local-file derby metastore connection URL
  • percentile_approx() fails with more than 1 reducer
  • CTAS should unescape the column name in the select-clause.
  • plan file should have a high replication factor
  • .gitignore files being placed in test warehouse directories causing build failure
  • TestCliDriver -Doverwrite=true does not put the file in the correct directory
  • fix or disable loadpart_err.q
  • Index followup: remove sort by clause and fix a bug in collect_set udaf
  • when generating reentrant INSERT for index rebuild, quote identifiers using backticks
  • Add cleanup method to HiveHistory class
  • Monitor the working set of the number of files
  • HiveCombineInputFormat should not use prefix matching to find the partitionDesc for a given path
  • hive.mapred.local.mem should only be used in case of local mode job submissions
  • ql tests no longer work in miniMR mode
  • Replace globStatus with listStatus inside Hive.java's replaceFiles.
  • Join filters do not work correctly with outer joins
  • alter partition should throw exception if the specified partition does not exist.
  • Unarchiving operation throws NPE
  • populate inputs and outputs for all statements
  • Fix TestContribCliDriver test
  • smb_mapjoin_8.q returns different results in miniMr mode
  • HBase tests broken
  • bucketizedhiveinputformat.q fails in minimr mode
  • referencing an added file by its name in a transform script does not work in hive local mode
  • Add conf. property hive.exec.show.job.failure.debug.info to enable/disable displaying link to the task with most failures
  • cleanup ExecDriver.progress
  • Hive should not override Hadoop specific system properties
  • wrong log files in contrib client positive
  • Add HBase/ZK JARs to Eclipse classpath
  • udtf_explode.q is an empty file
  • use SequenceFile rather than TextFile format for hive query results
  • need to sort hook input/output lists for test result determinism
  • Hadoop 0.17 ant test broken by HIVE-1523
  • For a null value in a string column, JDBC driver returns the string "NULL"
  • Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
  • UDTF json_tuple should return null row when input is not a valid JSON string
  • Fix Base64TextInputFormat to be compatible with commons codec 1.4
  • Patch to fix hashCode method in DoubleWritable class
  • bug in NO_DROP
  • CombineHiveInputFormat fails with "cannot find dir for emptyFile"
  • ExecDriver.addInputPaths() error if partition name contains a comma
  • Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )
  • TestContribNegativeCliDriver fails
  • All TestJdbcDriver test cases fail in Eclipse unless a property is added in run config
  • join results are displayed wrongly for some complex joins using select *
  • Fix describe * [extended] column formatting
  • ql/src/java/org/apache/hadoop/hive/ql/parse/SamplePruner.java is empty
  • Eclipse build broken
  • MapJoin throws EOFException when the mapjoined table has 0 column selected
  • multithreading on Context.pathToCS
  • Create table bug causes the row format property lost when serde is specified.
  • count(*) returns wrong result when a mapper returns empty results
  • NPE in MapJoin
  • In the MapJoinOperator, the code uses tag as alias, which is not always true
  • ANALYZE TABLE command should check columns in partition spec
  • incorrect partition pruning ANALYZE TABLE
  • bug when different partitions are present in different dfs
  • CREATE TABLE LIKE should not set stats in the new table
  • Migrating metadata from derby to mysql throws NullPointerException
  • duplicated MapRedTask in Multi-table inserts mixed with FileSinkOperator and ReduceSinkOperator
  • make TestHBaseCliDriver use dynamic ports to avoid conflicts with already-running services
  • ant clean should delete stats database
  • hbase_stats.q is failing
  • Two Bugs for Estimating Row Sizes in GroupByOperator
  • Fix Eclipse templates (and use Ivy metadata to generate Eclipse library dependencies)
  • Statistics broken for tables with size in excess of Integer.MAX_VALUE
  • HIVE 1633 hit for Stage2 jobs with CombineHiveInputFormat
  • failures in fatal.q in TestNegativeCliDriver
  • Many important broken links on Hive web page
  • Mismatched open/commit transaction calls in case of connection retry
  • Merge files does not work with dynamic partition
  • pcr.q output is non-deterministic
  • ROUND(infinity) chokes
  • Assertion on inputObjInspectors.length in GroupBy operator
  • parallel execution and auto-local mode combine to place plan file in wrong file system
  • Outdated comments for GenericUDTF.close()
  • Typo in hive-default.xml
  • outputs not populated for dynamic partitions at compile time
  • GenericUDFOr and GenericUDFAnd cannot receive boolean typed object
  • outputs not correctly populated for alter table
  • Mapjoin will fail if there are no files associating with the join tables
  • The merge criteria on dynamic partitions should be per partition
  • No Element found exception in BucketMapJoinOptimizer
  • bug in auto_join25.q
  • Hive comparison operators are broken for NaN values
  • spurious rmr failure messages when inserting with dynamic partitioning
  • show locks should not use getTable()/getPartition
  • Fix intermittent failures in TestRemoteMetaStore
  • mappers in group followed by joins may die OOM
  • Hanging hive client caused by TaskRunner's OutOfMemoryError
  • Some attributes in the Eclipse template file are deprecated
  • change hive assumption that local mode mappers/reducers always run in same jvm
  • bug in MAPJOIN
  • add more logging to partition pruning
  • downgrade JDO version
  • Temporarily disable metastore tests for listPartitionsByFilter()
  • mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message
  • Hive's smallint datatype is not supported by the Hive JDBC driver
  • Hive's float datatype is not supported by the Hive JDBC driver
  • Revive partition filtering in the Hive MetaStore
  • Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
  • test load_overwrite.q fails
  • Add mechanism for disabling tests with intermittent failures
  • TestRemoteHiveMetaStore.java accidentally deleted during commit of HIVE-1845
  • bug introduced by HIVE-1806
  • Fix 'tar' build target broken in HIVE-1526
  • fix HBase filter pushdown broken by HIVE-1638
  • Set the version of Hive trunk to '0.7.0-SNAPSHOT' to avoid confusing it with a release
  • HBase and Contrib JAR names are missing version numbers
  • Alter command execution "when HDFS is down" results in holding stale data in MetaStore
  • create script for the metastore upgrade due to HIVE-78
  • Can't join HBase tables if one's name is the beginning of the other
  • FileHandler leak on partial iteration of the resultset.
  • Double escaping special chars when removing old partitions in rmr
  • use partition level serde properties
  • failures in testhbaseclidriver
  • authorization on database level is broken.
  • CTAS (create-table-as-select) throws exception when showing results
  • Fix TestHadoop20SAuthBridge failure on Hudson
  • GRANT/REVOKE should handle privileges as tokens, not identifiers
  • alter table rename messes the location
  • hive.semantic.analyzer.hook cannot have multiple values
  • Fix test failure in TestContribCliDriver/url_hook.q
  • dynamic partition insert creating different directories for the same partition during merge
  • input16_cc.q is failing in testminimrclidriver
  • fix some outputs and make some tests deterministic
  • add fully deterministic ORDER BY in test union22.q and input40.q
  • TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
  • fix hbase_bulk.m by setting HiveInputFormat
  • TestHadoop20SAuthBridge failed on current trunk
  • Mismatched open/commit transaction calls when using get_partition()
  • Update README.txt and add missing ASF headers
  • Executing queries using Hive Server is not logging to the log file specified in hive-log4j.properties
  • Improve naming and README files for MetaStore upgrade scripts
  • upgrade-0.6.0.mysql.sql script attempts to increase size of PK COLUMNS.TYPE_NAME to 4000
  • Add datanucleus.identifierFactory property to HiveConf to avoid unintentional MetaStore Schema corruption
  • Make call to SecurityUtil.getServerPrincipal unambiguous
  • Sub-task:
  • table/partition level statistics
  • Add delegation token support to metastore
  • a followup patch for changing the description of hive.exec.pre/post.hooks in conf/hive-default.xml
  • upgrade the database thrift interface to allow parameters key-value pairs
  • Extend the CREATE DATABASE command with DBPROPERTIES
  • Add the local flag to all the map red tasks, if the query is running locally.
  • Task:
  • Hive should depend on a release version of Thrift
  • Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
  • Update Metastore upgrade scripts to handle schema changes introduced in HIVE-1413
  • Remove CHANGES.txt
  • Create MetaStore schema upgrade scripts for changes made in HIVE-417
  • Provide MetaStore schema upgrade scripts for changes made in HIVE-1823
  • Test:
  • improve test query performance
  • JDBM diff in test caused by HIVE-1641
  • merge_dynamic_part's result is not deterministic
  • change the value of hive.input.format to CombineHiveInputFormat for tests

New in Apache Hive 0.6.0 (May 21, 2013)

  • New Feature:
  • Add PERCENTILE aggregate function
  • add database/schema support Hive QL
  • Hive HBase Integration (umbrella)
  • row-wise IN would be useful
  • CommandProcessor should return DriverResponse
  • add udaf max_n, min_n to contrib
  • Bucketed Map Join
  • support views
  • multi-partition inserts
  • Create UDFs for XPath expression evaluation
  • Better Error Messages for Execution Errors
  • Let user script write out binary data into a table
  • CombinedHiveInputFormat for hadoop 19
  • Add UDF to create struct
  • Add column lineage information to the pre execution hooks
  • Add metastore API method to get partition by name
  • bucketing mapjoin where the big table contains more than 1 big partition
  • enforce bucketing for a table
  • Add UDF array_contains
  • ensure sorting properties for a table
  • sorted merge join
  • create a new input format where a mapper spans a file
  • More robust handling of metastore connection failures
  • Get partitions with a partial specification
  • Add mathematical UDFs PI, E, degrees, radians, tan, sign, and atan
  • Thread pool size in Thrift metastore server should be configurable
  • Add SymlinkTextInputFormat to Hive
  • Partition name to values conversion method
  • More generic and efficient merge method
  • Archiving partitions
  • Tool to cat rcfiles
  • histogram() UDAF for a numerical column
  • Web Interface can only browse default
  • Add TCP keepalive option for the metastore server
  • Alter the number of buckets for a table
  • Bug:
  • support count(*) and count distinct on multiple columns
  • getSchema returns invalid column names, getThriftSchema does not return old style string schemas
  • GenericUDTFExplode() throws NPE when given nulls
  • desc Table should work
  • typedbytes does not support nulls
  • function in a transform with more than 1 argument fails
  • Predicate push down does not work with UDTFs
  • NPE when operating HiveCLI in distributed mode
  • TestContribCliDriver failure in serde_typedbytes.q, serde_typedbytes2.q, and serde_typedbytes3.q
  • Make it possible for users to recover data when moveTask fails
  • ColumnarSerde should not be the default Serde when user specified a fileformat using 'stored as'.
  • Add "-Doffline=true" option to ant
  • Skew Join does not work in distributed env.
  • Conditional task does not increase finished job counter when filter job out.
  • Disable streaming last table if there is a skew key in previous tables.
  • bug with alter table rename when table has property EXTERNAL=FALSE
  • create view should expand the query text consistently
  • Hive CLI shows 'Ended Job=' at the beginning of the job
  • Assertion in ExecDriver.execute when assertions are enabled in HADOOP_OPTS
  • "datanucleus" typos in conf/hive-default.xml
  • Use TreeMap instead of Property to make explain extended deterministic
  • Job counter error if "hive.merge.mapfiles" equals true
  • 'create if not exists' fails for a table name with 'select' in it
  • Expression Not In Group By Key error is sometimes masked
  • Fix RCFile resource leak when opening a non-RCFile
  • Increase ObjectInspector[] length on demand
  • Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
  • typedbytes: writing to stderr kills the mapper
  • RowContainer should flush out dummy rows when the table desc is null
  • ScriptOperator AutoProgressor does not set the interval
  • CombineHiveInputFormat does not work for compressed text files
  • hints cannot be passed to transform statements
  • Task breaking bug when breaking after a filter operator
  • date_sub() function returns wrong date because of daylight saving time difference
  • joins between HBase tables and other tables (whether HBase or not) are broken
  • set merge files to files when bucketing/sorting is being enforced
  • ql.metadata.Hive#close() should check for null metaStoreClient
  • Cannot start metastore thrift server on a specific port
  • Case sensitiveness of type information specified when using custom reducer causes type mismatch
  • UDF_Percentile NullPointerException
  • bug in sort merge join if the big table does not have any row
  • TestHBaseCliDriver hangs
  • Select query with specific projection(s) fails if the local file system directory for ${hive.user.scratchdir} does not exist.
  • problem in combinehiveinputformat with nested directories
  • Bucketing column names in create table should be case-insensitive
  • error/info message being emitted on standard output
  • sort merge join does not work with bucketizedhiveinputformat
  • Fix UDAFPercentile IndexOutOfBoundsException
  • HIVE_AUX_JARS_PATH interferes with startup of Hive Web Interface
  • unit test symlink_text_input_format.q needs ORDER BY for determinism
  • = throws NPE
  • bug in use of hadoop supports splittable
  • hive trunk does not compile with hadoop 0.17 any more
  • bucketed sort merge join breaks after dynamic partition insert
  • CombineHiveInputFormat throws exception when partition name contains special characters to URI
  • NPE with lineage in a query of union alls on joins.
  • bugs with temp directories, trailing blank fields in HBase bulk load
  • Cached FileSystem can lead to persistent IOExceptions
  • leading dash in partition name is not handled properly
  • dynamic partition insert should throw an exception if the number of target table columns + dynamic partition columns does not equal the number of select columns
  • RowContainer uses hard-coded '/tmp/' path for temporary files
  • Group by partition column returns wrong results
  • fatal error check omitted for reducer-side operators
  • select * does not work if different partitions contain different formats
  • Fix bin/ext/jar.sh to work with hadoop 0.20 and above
  • Filter Operator Column Pruning should preserve the column order
  • TypedBytesSerDe fails to create table with multiple columns.
  • hive.query.id is not unique
  • rcfilecat should use '\t' to separate columns and print '\r\n' at the end of each row.
  • load_dyn_part*.q tests need ORDER BY for determinism
  • partition level properties honored if it exists
  • Increase the maximum length of various metastore fields, and remove TYPE_NAME from COLUMNS primary key
  • Bug in SMBJoinOperator which may cause a final part of the results in some cases.
  • inputFileFormat error if the merge job takes a different input file format than the default output file format
  • remove blank in rcfilecat
  • Missing connection pool plugin in Eclipse classpath
  • getPartitionDescFromPath() in CombineHiveInputFormat should handle matching by path
  • combinehiveinputformat does not work if files are of different types
  • Reporting progress to JT during closing files in FileSinkOperator
  • Add hadoop-*-tools.jar to Eclipse classpath
  • File format information is retrieved from first partition
  • DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
  • CombineHiveInputFormat bug on tablesample
  • Archived partitions throw error with queries calling getContentSummary
  • column pruning not working with lateral view
  • problem with sequence and rcfiles are mixed for null partitions
  • hive.task.progress should be added to conf/hive-default.xml
  • ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
  • Upgraded naming scheme causes JDO exceptions
  • bug in 'set fileformat'
  • insert overwrite and CTAS fail in hive local mode
  • lateral view does not work with column pruning
  • FileSinkOperator should remove duplicated files from the same task based on file sizes
  • parallel execution failed if mapred.job.name is set
  • Typo of hive.merge.size.smallfiles.avgsize prevents change of value
  • hive --service jar looks for hadoop version but was not defined
  • Web Interface JSP needs Refactoring for removed meta store methods
  • ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
  • Migration scripts should increase size of PARAM_VALUE in PARTITION_PARAMS
  • Improvement:
  • provide option to run hive in local mode
  • handle skewed keys for a join in a separate job
  • Incorporate CheckStyle into Hive's build.xml
  • Merge tasks in GenMRUnion1
  • CREATE VIEW followup: add a "table type" enum attribute in metastore's MTable, and also null out irrelevant attributes for MTable instances which describe views
  • CREATE VIEW followup: find and document current expected version of thrift, and regenerate code to match
  • Add a "skew join map join size" variable to control the input size of skew join's following map join job.
  • make number of concurrent tasks configurable
  • QueryPlan to be independent from BaseSemanticAnalyzer
  • Structured temporary directories
  • add counters to show that skew join triggered
  • Make QueryPlan serializable
  • Add hive.merge.size.per.task to HiveConf
  • Make all Tasks and Works serializable
  • In ivy offline mode, don't delete downloaded jars
  • Make ql/metadata/Table and Partition serializable
  • Let max/min handle complex types like struct
  • add type-checking setters for HiveConf class to match existing getters
  • CREATE VIEW followup: support ALTER TABLE SET TBLPROPERTIES on views
  • Add comment to explain why we check for dir first in add_partitions().
  • Add metastore API method to drop partition / append partition by name
  • drop_partition_by_name() should use drop_partition_common()
  • Configure build to download Hadoop tarballs from Facebook mirror instead of Apache
  • When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
  • Explicitly say "Hive Internal Error" to ease debugging
  • Show the row with error in mapper/reducer
  • accept TBLPROPERTIES on CREATE TABLE/VIEW
  • allow HBase key column to be anywhere in Hive table
  • add pre-drops in bucketmapjoin*.q
  • add backward-compatibility constructor to HiveMetaStoreClient
  • mapjoin followed by another mapjoin should be performed in a single query
  • from_unixtime should implement an overloaded function to accept only bigint type
  • optimize bucketing
  • facilitate HBase bulk loads from Hive
  • CLI set and set -v commands should dump properties in alphabetical order
  • error message in Hive.checkPaths dumps Java array address instead of path string
  • support: alter table touch partition
  • cleanup the jobscratchdir
  • Increase the memory limit for CLI client
  • make mapred.input.dir.recursive work for select *
  • for ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'), change TBL_TYPE attribute from MANAGED_TABLE to EXTERNAL_TABLE
  • DataNucleus should use connection pooling
  • Moving inputFileChanged() from ExecMapper to where it is needed
  • Do not pull counters of non initialized jobs
  • Hive should use NullOutputFormat for hadoop jobs
  • CombineHiveInputSplit should initialize the inputFileFormat once for a single split
  • New algorithm for variance() UDAF
  • allow HBase WAL to be disabled
  • Add PERCENTILE_APPROX which works with double data type
  • Make Hive build work with Ivy versions < 2.1.0
  • set abort in ExecMapper when Hive's record reader got an IOException
  • Make the compile target depend on thrift.home
  • Task:
  • Automated source code cleanup
  • Cleanup Class names
  • Add .gitignore file
  • Suppress Checkstyle warnings for generated files
  • Replace instances of StringBuffer/Vector with StringBuilder/ArrayList
  • Checkstyle fixes
  • Use Anakia for version controlled documentation
  • build references IVY_HOME incorrectly
  • Update Eclipse project configuration to match Checkstyle
  • Eclipse launchtemplate changes to enable debugging
  • fix Hive logo img tag to avoid stretching
  • Provide metastore schema migration scripts (0.5 -> 0.6)
  • Provide Postgres metastore schema migration scripts (0.5 -> 0.6)
  • Include metastore upgrade scripts in release tarball
  • Update README file for 0.6.0 release
  • Satisfy ASF release management requirements
  • Sub-task:
  • checking VOID type for NULL in LazyBinarySerde
  • Test:
  • NPE when running TestJdbcDriver/TestHiveServer
  • test HBase input format plus CombinedHiveInputFormat
  • temporarily disable HBase test execution
  • Unit test should be shim-aware