Apache Hive Changelog

New in Apache Hive 1.2.0 (May 19, 2015)

Sub-task:
[HIVE-8119] - Implement Date in ParquetSerde
[HIVE-8164] - Adding in a ReplicationTask that converts a Notification Event to actionable tasks
[HIVE-8165] - Annotation changes for replication
[HIVE-8379] - NanoTimeUtils performs some work needlessly
[HIVE-8696] - HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[HIVE-8817] - Create unit test where we insert into an encrypted table and then read from it with pig
[HIVE-8818] - Create unit test where we insert into an encrypted table and then read from it with hcatalog mapreduce
[HIVE-9009] - order by (limit) meaning for the last subquery of union in Hive is different from other mainstream RDBMS
[HIVE-9253] - MetaStore server should support timeout for long running requests
[HIVE-9271] - Add ability for client to request metastore to fire an event
[HIVE-9273] - Add option to fire metastore event on insert
[HIVE-9327] - CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
[HIVE-9333] - Move parquet serialize implementation to DataWritableWriter to improve write speeds
[HIVE-9432] - CBO (Calcite Return Path): Removing QB from ParseContext
[HIVE-9501] - DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
[HIVE-9508] - MetaStore client socket connection should have a lifetime
[HIVE-9550] - ObjectStore.getNextNotification() can return events inside NotificationEventResponse as null which conflicts with its thrift "required" tag
[HIVE-9558] - [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode
[HIVE-9563] - CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]
[HIVE-9571] - CBO (Calcite Return Path): Generate FileSink Op [CBO branch]
[HIVE-9582] - HCatalog should use IMetaStoreClient interface
[HIVE-9585] - AlterPartitionMessage should return getKeyValues instead of getValues
[HIVE-9657] - Use new parquet Types API builder to construct data types
[HIVE-9666] - Improve some qtests
[HIVE-9690] - Refactoring for non-numeric arithmetic operations
[HIVE-9750] - avoid log locks in operators
[HIVE-9792] - Support interval type in expressions/predicates
[HIVE-9810] - prep object registry for multi threading
[HIVE-9819] - Add timeout check inside the HMS server
[HIVE-9824] - LLAP: Native Vectorization of Map Join
[HIVE-9894] - Use new parquet Types API builder to construct DATE data type
[HIVE-9906] - Add timeout mechanism in RawStoreProxy
[HIVE-9937] - LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[HIVE-9982] - CBO (Calcite Return Path): Prune TS Relnode schema
[HIVE-9998] - Vectorization support for interval types
[HIVE-10037] - JDBC support for interval expressions
[HIVE-10044] - Allow interval params for year/month/day/hour/minute/second functions
[HIVE-10053] - Override new init API from ReadSupport instead of the deprecated one
[HIVE-10071] - CBO (Calcite Return Path): Join to MultiJoin rule
[HIVE-10076] - Bump up parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6
[HIVE-10131] - LLAP: BytesBytesMultiHashMap and mapjoin container should reuse refs
[HIVE-10227] - Concrete implementation of Export/Import based ReplicationTaskFactory
[HIVE-10228] - Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[HIVE-10243] - CBO (Calcite Return Path): Introduce JoinAlgorithm Interface
[HIVE-10252] - Make PPD work for Parquet in row group level
[HIVE-10262] - CBO (Calcite Return Path): Temporarily disable Aggregate check input for bucketing
[HIVE-10263] - CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional
[HIVE-10326] - CBO (Calcite Return Path): Invoke Hive's Cumulative Cost
[HIVE-10329] - Hadoop reflectionutils has issues
[HIVE-10343] - CBO (Calcite Return Path): Parameterize algorithm cost model
[HIVE-10347] - Merge spark to trunk 4/15/2015
[HIVE-10350] - CBO: Use total size instead of bucket count to determine number of splits & parallelism
[HIVE-10369] - CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled
[HIVE-10375] - CBO (Calcite Return Path): disable the identity project remover for some union operators
[HIVE-10386] - CBO (Calcite Return Path): Disable Trivial Project Removal on ret path
[HIVE-10391] - CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column
[HIVE-10400] - CBO (Calcite Return Path): Exception when column name contains dot or colon characters
[HIVE-10413] - [CBO] Return path assumes distinct column can't be same as grouping column
[HIVE-10416] - CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite
[HIVE-10426] - Rework/simplify ReplicationTaskFactory instantiation
[HIVE-10455] - CBO (Calcite Return Path): Different data types at Reducer before JoinOp
[HIVE-10462] - CBO (Calcite Return Path): MapJoin and SMBJoin conversion not triggered
[HIVE-10493] - Merge multiple joins when join keys are the same
[HIVE-10506] - CBO (Calcite Return Path): Disallow return path to be enabled if CBO is off
[HIVE-10512] - CBO (Calcite Return Path): SMBJoin conversion throws ClassCastException
[HIVE-10520] - LLAP: Must reset small table result columns for Native Vectorization of Map Join
[HIVE-10522] - CBO (Calcite Return Path): fix the wrong needed column names when TS is created
[HIVE-10526] - CBO (Calcite Return Path): HiveCost epsilon comparison should take row count into account
[HIVE-10547] - CBO (Calcite Return Path) : genFileSinkPlan uses wrong partition col to create FS
[HIVE-10549] - CBO (Calcite Return Path): Enable NonBlockingOpDeDupProc

Bug:
[HIVE-3454] - Problem with CAST(BIGINT as TIMESTAMP)
[HIVE-4625] - HS2 should not attempt to get delegation token from metastore if using embedded metastore
[HIVE-5545] - HCatRecord getInteger method returns String when used on Partition columns of type INT
[HIVE-5672] - Insert with custom separator not supported for non-local directory
[HIVE-6069] - Improve error message in GenericUDFRound
[HIVE-6099] - Multi insert does not work properly with distinct count
[HIVE-6950] - Parsing Error in GROUPING SETS
[HIVE-7351] - ANALYZE TABLE statement fails on postgres metastore
[HIVE-7641] - INSERT ... SELECT with no source table leads to NPE
[HIVE-8524] - When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS
[HIVE-8626] - Extend HDFS super-user checks to dropPartitions
[HIVE-8746] - ORC timestamp columns are sensitive to daylight savings time
[HIVE-8890] - HiveServer2 dynamic service discovery: use persistent ephemeral nodes curator recipe
[HIVE-8915] - Log file explosion due to non-existence of COMPACTION_QUEUE table
[HIVE-9002] - union all does not generate correct result for order by and limit
[HIVE-9023] - HiveHistoryImpl relies on removed counters to print num rows
[HIVE-9073] - NPE when using custom windowing UDAFs
[HIVE-9083] - New metastore API to support to purge partition-data directly in dropPartitions().
[HIVE-9086] - Add language support to PURGE data while dropping partitions.
[HIVE-9115] - Hive build failure on hadoop-2.7 due to HADOOP-11356
[HIVE-9118] - Support auto-purge for tables, when dropping tables/partitions.
[HIVE-9151] - Checking s against null in TezJobMonitor#getNameWithProgress() should be done earlier
[HIVE-9228] - Problem with subquery using windowing functions
[HIVE-9303] - Parquet files are written with incorrect definition levels
[HIVE-9322] - Make null-checks consistent for MapObjectInspector subclasses.
[HIVE-9350] - Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'
[HIVE-9397] - SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
[HIVE-9430] - NullPointerException on ALTER TABLE ADD PARTITION if no value given
[HIVE-9438] - The standalone-jdbc jar missing some jars
[HIVE-9456] - Make Hive support unicode with MSSQL as Metastore backend
[HIVE-9468] - Test groupby3_map_skew.q fails due to decimal precision difference
[HIVE-9471] - Bad seek in uncompressed ORC, at row-group boundary.
[HIVE-9472] - Implement 7 simple UDFs added to Hive
[HIVE-9474] - truncate table changes permissions on the target
[HIVE-9481] - allow column list specification in INSERT statement
[HIVE-9482] - Hive parquet timestamp compatibility
[HIVE-9484] - ThriftCLIService#getDelegationToken does case sensitive comparison
[HIVE-9486] - Use session classloader instead of application loader
[HIVE-9489] - add javadoc for UDFType annotation
[HIVE-9496] - Slf4j warning in hive command
[HIVE-9507] - Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[HIVE-9509] - Restore partition spec validation removed by HIVE-9445
[HIVE-9512] - HIVE-9327 causing regression in stats annotation
[HIVE-9513] - NULL POINTER EXCEPTION
[HIVE-9526] - ClassCastException thrown by HiveStatement
[HIVE-9529] - "alter table .. concatenate" under Tez mode should create TezTask
[HIVE-9539] - Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[HIVE-9553] - Fix log-line in Partition Pruner
[HIVE-9555] - assorted ORC refactorings for LLAP on trunk
[HIVE-9560] - When hive.stats.collect.rawdatasize=true, 'rawDataSize' for an ORC table will result in value '0' after running 'analyze table TABLE_NAME compute statistics;'
[HIVE-9565] - Minor cleanup in TestMetastoreExpr.
[HIVE-9567] - JSON SerDe not escaping special chars when writing char/varchar data
[HIVE-9580] - Server returns incorrect result from JOIN ON VARCHAR columns
[HIVE-9587] - UDF decode should accept STRING_GROUP types for the second parameter
[HIVE-9588] - Reimplement HCatClientHMSImpl.dropPartitions() with HMSC.dropPartitions()
[HIVE-9592] - fix ArrayIndexOutOfBoundsException in date_add and date_sub initialize
[HIVE-9609] - AddPartitionMessage.getPartitions() can return null
[HIVE-9612] - Turn off DEBUG logging for Lazy Objects for tests
[HIVE-9613] - Left join query plan outputs wrong column when using subquery
[HIVE-9617] - UDF from_utc_timestamp throws NPE if the second argument is null
[HIVE-9619] - Uninitialized read of numBitVectors in NumDistinctValueEstimator
[HIVE-9620] - Cannot retrieve column statistics using HMS API if column name contains uppercase characters
[HIVE-9622] - Getting NPE when trying to restart HS2 when metastore is configured to use org.apache.hadoop.hive.thrift.DBTokenStore
[HIVE-9623] - NullPointerException in MapJoinOperator.processOp(MapJoinOperator.java:253) for TPC-DS Q75 against un-partitioned schema
[HIVE-9624] - NullPointerException in MapJoinOperator.processOp(MapJoinOperator.java:253) for TPC-DS Q75 against un-partitioned schema
[HIVE-9628] - HiveMetaStoreClient.dropPartitions(...List...) doesn't take (boolean needResult)
[HIVE-9633] - Add HCatClient.dropPartitions() overload to skip deletion of partition-directories.
[HIVE-9644] - Fold case & when udfs
[HIVE-9645] - Constant folding case NULL equality
[HIVE-9647] - Discrepancy in cardinality estimates between partitioned and un-partitioned tables
[HIVE-9648] - Null check key provider before doing set
[HIVE-9652] - Tez in place updates should detect redirection of STDERR
[HIVE-9655] - Dynamic partition table insertion error
[HIVE-9665] - Parallel move task optimization causes race condition
[HIVE-9667] - Disable ORC bloom filters for ORC v11 output-format
[HIVE-9674] - *DropPartitionEvent should handle partition-sets.
[HIVE-9679] - Remove redundant null-checks from DbNotificationListener.
[HIVE-9680] - GlobalLimitOptimizer is not checking filters correctly
[HIVE-9681] - Extend HiveAuthorizationProvider to support partition-sets.
[HIVE-9706] - HBase handler support for snapshots should confirm properties before use
[HIVE-9711] - ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN
[HIVE-9716] - Map job fails when table's LOCATION does not have scheme
[HIVE-9717] - The max/min function used by AggrStats for decimal type is not what we expected
[HIVE-9720] - Metastore does not properly migrate column stats when renaming a table across databases.
[HIVE-9721] - Hadoop23Shims.setFullFileStatus should check for null
[HIVE-9727] - GroupingID translation from Calcite
[HIVE-9731] - WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified
[HIVE-9734] - Correlating expression cannot contain unqualified column references
[HIVE-9735] - aggregate (smallint) fails when ORC file is used: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Short
[HIVE-9743] - Incorrect result set for vectorized left outer join
[HIVE-9749] - ObjectStore schema verification logic is incorrect
[HIVE-9754] - rename GenericUDFLevenstein to GenericUDFLevenshtein
[HIVE-9755] - Hive built-in "ngram" UDAF fails when a mapper has no matches.
[HIVE-9767] - Fixes in Hive UDF to be usable in Pig
[HIVE-9770] - Beeline ignores --showHeader for non-tabular output formats, i.e. csv, tsv, dsv
[HIVE-9772] - Hive parquet timestamp conversion doesn't work with new Parquet
[HIVE-9779] - ATSHook does not log the end user if doAs=false (it logs the hs2 server user)
[HIVE-9791] - insert into table throws NPE
[HIVE-9797] - Need to update some Spark tests for Java 8
[HIVE-9813] - Hive JDBC - DatabaseMetaData.getColumns method cannot find classes added with "add jar" command
[HIVE-9817] - fix DateFormat pattern in hive-exec
[HIVE-9826] - Firing insert event fails on temporary table
[HIVE-9831] - HiveServer2 should use ConcurrentHashMap in ThreadFactory
[HIVE-9832] - Merge join followed by union and a map join in hive on tez fails.
[HIVE-9834] - VectorGroupByOperator logs too much
[HIVE-9836] - Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns)
[HIVE-9839] - HiveServer2 leaks OperationHandle on async queries which fail at compile phase
[HIVE-9841] - IOException thrown by ORC should include the path of processing file
[HIVE-9845] - HCatSplit repeats information making input split data size huge
[HIVE-9848] - readlink -f is GNU coreutils only (used in bin/hive)
[HIVE-9851] - org.apache.hadoop.hive.serde2.avro.AvroSerializer should use org.apache.avro.generic.GenericData.Array when serializing a list
[HIVE-9855] - Runtime skew join doesn't work when skewed data only exists in big table
[HIVE-9860] - MapredLocalTask/SecureCmdDoAs leaks local files
[HIVE-9866] - Changing a column's type doesn't change column stats type in metastore
[HIVE-9869] - Trunk doesn't build with hadoop-1
[HIVE-9873] - Hive on MR throws DeprecatedParquetHiveInput exception
[HIVE-9877] - Beeline cannot run multiple statements in the same row
[HIVE-9886] - Hive on tez: NPE when converting join to SMB in sub-query
[HIVE-9892] - various MSSQL upgrade scripts don't work
[HIVE-9908] - vectorization error binary type not supported, group by with binary columns
[HIVE-9915] - Allow specifying file format for managed tables
[HIVE-9919] - upgrade scripts don't work on some auto-created DBs due to absence of tables
[HIVE-9920] - DROP DATABASE IF EXISTS throws exception if database does not exist
[HIVE-9923] - No clear message when "from" is missing
[HIVE-9929] - StatsUtil#getAvailableMemory could return negative value
[HIVE-9930] - fix QueryPlan.makeQueryId time format
[HIVE-9932] - DDLTask.conf hides base class Task.conf
[HIVE-9934] - Vulnerability in LdapAuthenticationProviderImpl enables HiveServer2 client to degrade the authentication mechanism to "none", allowing authentication without password
[HIVE-9936] - fix potential NPE in DefaultUDAFEvaluatorResolver
[HIVE-9944] - Convert array[] to string properly in log messages
[HIVE-9945] - FunctionTask.conf hides Task.conf field
[HIVE-9947] - ScriptOperator replaceAll uses unescaped dot and result is not assigned
[HIVE-9948] - SparkUtilities.getFileName passes File.separator to String.split() method
[HIVE-9950] - fix rehash in CuckooSetBytes and CuckooSetLong
[HIVE-9951] - VectorizedRCFileRecordReader creates Exception but does not throw it
[HIVE-9952] - fix NPE in CorrelationUtilities
[HIVE-9953] - fix NPE in WindowingTableFunction
[HIVE-9954] - UDFJson uses the == operator to compare Strings
[HIVE-9955] - TestVectorizedRowBatchCtx compares byte[] using equals() method
[HIVE-9956] - use BigDecimal.valueOf instead of new in TestFileDump
[HIVE-9957] - Hive 1.1.0 not compatible with Hadoop 2.4.0
[HIVE-9961] - HookContext for view should return a table type of VIRTUAL_VIEW
[HIVE-9971] - Clean up operator class
[HIVE-9975] - Renaming a nonexistent partition should not throw NullPointerException
[HIVE-9976] - Possible race condition in DynamicPartitionPruner for
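
Several fixes in the list above (HIVE-9947, HIVE-9948, HIVE-9954, HIVE-9955, HIVE-9956) name common Java pitfalls rather than Hive-specific behavior. The sketch below illustrates those pitfalls in generic Java. It is not taken from the Hive source, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.regex.Pattern;

/** Illustrative only: generic Java, not code taken from Hive. */
final class JavaPitfalls {

    // Pattern named in HIVE-9947: replaceAll takes a regex, so an unescaped "."
    // matches any character, and because Strings are immutable the result
    // must be assigned or returned.
    static String dotsToUnderscores(String s) {
        return s.replaceAll("\\.", "_");
    }

    // Pattern named in HIVE-9948: String.split also takes a regex. Quoting the
    // separator makes a backslash (File.separator on Windows) a literal.
    static String lastSegment(String path, String separator) {
        String[] parts = path.split(Pattern.quote(separator));
        return parts.length == 0 ? "" : parts[parts.length - 1];
    }

    // Pattern named in HIVE-9954: compare String contents with equals(), not ==.
    static boolean sameText(String a, String b) {
        return a != null && a.equals(b);
    }

    // Pattern named in HIVE-9955: byte[].equals() is reference equality,
    // so array contents are compared with Arrays.equals.
    static boolean sameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Pattern named in HIVE-9956: BigDecimal.valueOf(double) goes through
    // Double.toString, while new BigDecimal(double) exposes the exact
    // binary expansion of the double.
    static BigDecimal fromDouble(double d) {
        return BigDecimal.valueOf(d);
    }

    public static void main(String[] args) {
        System.out.println(dotsToUnderscores("a.b.c"));
        System.out.println(lastSegment("C:\\tmp\\data.txt", "\\"));
        System.out.println(sameText(new String("x"), "x"));
        System.out.println(sameBytes(new byte[] {1, 2}, new byte[] {1, 2}));
        System.out.println(fromDouble(0.1));
    }
}
```

How each issue was actually resolved in Hive is recorded in the corresponding JIRA ticket. The methods here only show the safe form of each idiom.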

New in Apache Hive 1.1.0 (May 19, 2015)

Sub-task:
[HIVE-7073] - Implement Binary in ParquetSerDe
[HIVE-8121] - Create micro-benchmarks for ParquetSerde and evaluate performance
[HIVE-8122] - Make use of SearchArgument classes for Parquet SERDE
[HIVE-8130] - Support Date in Avro
[HIVE-8131] - Support timestamp in Avro
[HIVE-8362] - Investigate flaky test parallel.q
[HIVE-8651] - CBO: sort column changed in infer_bucket_sort test
[HIVE-8707] - Fix ordering differences due to Java 8 HashMap function
[HIVE-8718] - Refactoring: move mapLocalWork field from MapWork to BaseWork
[HIVE-8773] - Fix TestWebHCatE2e#getStatus for Java8
[HIVE-8862] - Fix ordering differences on TestParse tests due to Java8
[HIVE-8922] - CBO: assorted date and timestamp issues
[HIVE-8923] - HIVE-8512 needs to be fixed also for CBO
[HIVE-8936] - Add SORT_QUERY_RESULTS for join tests that do not guarantee order
[HIVE-8962] - Add SORT_QUERY_RESULTS for join tests that do not guarantee order #2
[HIVE-9030] - CBO: Plans with comparison of values with different types
[HIVE-9033] - Fix ordering differences due to Java8 (part 2)
[HIVE-9034] - CBO: type change in literal_ints.q
[HIVE-9035] - CBO: Disable PPD when functions are non-deterministic (ppd_random.q - non-deterministic udf rand() pushed above join)
[HIVE-9043] - HiveException: Conflict on row inspector for {table}
[HIVE-9066] - temporarily disable CBO for non-deterministic functions
[HIVE-9104] - windowing.q failed when mapred.reduce.tasks is set to larger than one
[HIVE-9109] - Add support for Java 8 specific q-test out files
[HIVE-9127] - Improve CombineHiveInputFormat.getSplit performance
[HIVE-9133] - CBO (Calcite Return Path): Refactor Semantic Analyzer to Move CBO code out
[HIVE-9153] - Perf enhancement on CombineHiveInputFormat and HiveInputFormat
[HIVE-9161] - Fix ordering differences on UDF functions due to Java8
[HIVE-9174] - Enable queuing of HCatalog notification events in metastore DB
[HIVE-9175] - Add alters to list of events handled by NotificationListener
[HIVE-9181] - Fix SkewJoinOptimizer related Java 8 ordering differences
[HIVE-9184] - Modify HCatClient to support new notification methods in HiveMetaStoreClient
[HIVE-9193] - Fix ordering differences due to Java 8 (Part 3)
[HIVE-9194] - Support select distinct *
[HIVE-9200] - CBO (Calcite Return Path): Inline Join, Properties
[HIVE-9206] - Fix Desc Formatted related Java 8 ordering differences
[HIVE-9211] - Research on building a mini HoS cluster on YARN for unit tests [Spark Branch]
[HIVE-9222] - Fix ordering differences due to Java 8 (Part 4)
[HIVE-9224] - CBO (Calcite Return Path): Inline Table, Properties
[HIVE-9239] - Fix ordering differences due to Java 8 (Part 5)
[HIVE-9241] - Fix TestCliDriver.testCliDriver_subquery_multiinsert
[HIVE-9257] - Merge from spark to trunk January 2015
[HIVE-9259] - Fix ClassCastException when CBO is enabled for HOS [Spark Branch]
[HIVE-9264] - Merge encryption branch to trunk
[HIVE-9292] - CBO (Calcite Return Path): Inline GroupBy, Properties
[HIVE-9315] - CBO (Calcite Return Path): Inline FileSinkOperator, Properties
[HIVE-9321] - Notification message size can be arbitrarily long, DbNotificationListener limits to 1024
[HIVE-9352] - Merge from spark to trunk (follow-up of HIVE-9257)
[HIVE-9409] - Avoid ser/de loggers as logging framework can be incompatible on driver and workers
[HIVE-9410] - ClassNotFoundException occurs during hive query case execution with UDF defined [Spark Branch]
[HIVE-9425] - Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
[HIVE-9428] - LocalSparkJobStatus may return failed job as successful [Spark Branch]
[HIVE-9431] - CBO (Calcite Return Path): Removing AST from ParseContext
[HIVE-9434] - Shim the method Path.getPathWithoutSchemeAndAuthority
[HIVE-9444] - CBO (Calcite Return Path): Rewrite GlobalLimitOptimizer
[HIVE-9449] - Push YARN configuration to Spark while deploying Spark on YARN [Spark Branch]
[HIVE-9450] - [Parquet] Check all data types work for Parquet in Group By operator
[HIVE-9477] - No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
[HIVE-9487] - Make Remote Spark Context secure [Spark Branch]
[HIVE-9493] - Failed job may not throw exceptions [Spark Branch]

Bug:
[HIVE-1344] - error in select distinct
[HIVE-1654] - select distinct should allow column name regex
[HIVE-1869] - TestMTQueries failing on jenkins
[HIVE-3781] - Index related events should be delivered to metastore event listener
[HIVE-4009] - CLI Tests fail randomly due to MapReduce LocalJobRunner race condition
[HIVE-5536] - Incorrect Operation Name is passed to hookcontext
[HIVE-5865] - AvroDeserializer incorrectly assumes keys to Maps will always be of type 'org.apache.avro.util.Utf8'
[HIVE-6165] - Unify HivePreparedStatement from jdbc:hive and jdbc:hive2
[HIVE-6308] - COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
[HIVE-6421] - abs() should preserve precision/scale of decimal input
[HIVE-6623] - Add "owner" tag to ptest2 created instances
[HIVE-6683] - Beeline does not accept comments at end of line
[HIVE-6914] - parquet-hive cannot write nested map (map value is map)
[HIVE-7024] - Escape control characters for explain result
[HIVE-7069] - Zookeeper connection leak
[HIVE-7932] - May cause a NullPointerException when adding accessed columns to ReadEntity
[HIVE-7951] - InputFormats implementing (Job)Configurable should not be cached
[HIVE-7997] - Potential null pointer reference in ObjectInspectorUtils#compareTypes()
[HIVE-8182] - beeline fails when executing multiple-line queries with trailing spaces
[HIVE-8257] - Accumulo introduces old hadoop-client dependency
[HIVE-8266] - create function using statement compilation should include resource URI entity
[HIVE-8284] - Equality comparison is done between two floating point variables in HiveRelMdUniqueKeys#getUniqueKeys()
[HIVE-8308] - Acid related table properties should be defined in one place and should be case insensitive
[HIVE-8317] - WebHCat pom should explicitly depend on jersey-core
[HIVE-8326] - Using DbTxnManager with concurrency off results in run time error
[HIVE-8330] - HiveResultSet.findColumn() parameters are case sensitive
[HIVE-8338] - Add ip and command to semantic analyzer hook context
[HIVE-8345] - q-test for Avro date support
[HIVE-8359] - Map containing null values are not correctly written in Parquet files
[HIVE-8381] - Update hive version on trunk to 0.15
[HIVE-8387] - add retry logic to ZooKeeperStorage in WebHCat
[HIVE-8448] - Union All might not work due to the type conversion issue
[HIVE-8450] - Create table like does not copy over table properties
[HIVE-8491] - Fix build name in ptest pre-commit message
[HIVE-8500] - beeline does not need to set hive.aux.jars.path
[HIVE-8512] - queries with star and gby produce incorrect results
[HIVE-8518] - Compile time skew join optimization returns duplicated results
[HIVE-8523] - Potential null dereference in DDLSemanticAnalyzer#addInputsOutputsAlterTable()
[HIVE-8556] - introduce overflow control and sanity check to BytesBytesMapJoin
[HIVE-8564] - DROP TABLE IF EXISTS throws exception if the table does not exist.
[HIVE-8565] - beeline may go into an infinite loop when using EOF
[HIVE-8576] - Guaranteed NPE in StatsRulesProcFactory
[HIVE-8594] - Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[HIVE-8600] - Add option to log explain output for query
[HIVE-8610] - Compile time skew join optimization doesn't work with auto map join
[HIVE-8611] - grant/revoke syntax should support additional objects for authorization plugins
[HIVE-8612] - Support metadata result filter hooks
[HIVE-8613] - percentile_approx raises a comparator error
[HIVE-8627] - Compute stats on a table from impala caused the table to be corrupted
[HIVE-8634] - HiveServer2 fair scheduler queue mapping doesn't handle the secondary groups rules correctly
[HIVE-8636] - CBO: split cbo_correctness test
[HIVE-8666] - hive.metastore.server.max.threads default is too high
[HIVE-8680] - Set Max Message for Binary Thrift endpoints
[HIVE-8693] - Separate out fair scheduler dependency from hadoop 0.23 shim
[HIVE-8708] - Add query id to explain log option
[HIVE-8720] - Update orc_merge tests to make it consistent across OS'es
[HIVE-8728] - Fix ptf.q determinism
[HIVE-8730] - schemaTool failure when date partition has non-date value
[HIVE-8736] - add ordering to cbo_correctness to make result consistent
[HIVE-8757] - YARN dep in scheduler shim should be optional
[HIVE-8762] - HiveMetaStore.BooleanPointer should be replaced with an AtomicBoolean
[HIVE-8791] - Hive permission inheritance throws exception S3
[HIVE-8796] - TestCliDriver acid tests with decimal needs benchmark to be updated
[HIVE-8797] - Simultaneous dynamic inserts can result in "partition already exists" error
[HIVE-8803] - DESC SCHEMA is not working
[HIVE-8808] - HiveInputFormat caching cannot work with all input formats
[HIVE-8812] - TestMinimrCliDriver failure if run in the same command as TestHBaseNegativeCliDriver
[HIVE-8825] - SQLCompletor catches Throwable and ignores it
[HIVE-8847] - Fix bugs in jenkins scripts
[HIVE-8848] - data loading from text files or text file processing doesn't handle nulls correctly
[HIVE-8850] - ObjectStore:: rollbackTransaction() needs to be looked into further.
[HIVE-8863] - Cannot drop table with uppercase name after "compute statistics for columns"
[HIVE-8869] - RowSchema not updated for some ops when columns are pruned
[HIVE-8872] - Hive view of HBase range scan intermittently returns incorrect data.
[HIVE-8874] - Error Accessing HBase from Hive via Oozie on Kerberos 5.0.1 cluster
[HIVE-8875] - hive.optimize.sort.dynamic.partition should be turned off for ACID
[HIVE-8877] - improve context logging during job submission via WebHCat
[HIVE-8879] - Upgrade derby version to address race condition
[HIVE-8881] - Receiving json "{"error":"Could not find job job_1415748506143_0002"}" when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
[HIVE-8889] - JDBC Driver ResultSet.getXXXXXX(String columnLabel) methods Broken
[HIVE-8891] - Another possible cause to NucleusObjectNotFoundException from drops/rollback
[HIVE-8893] - Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[HIVE-8901] - increase retry attempt, interval on metastore database errors
[HIVE-8909] - Hive doesn't correctly read Parquet nested types
[HIVE-8914] - HDFSCleanup thread holds reference to FileSystem
[HIVE-8916] - Handle user@domain username under LDAP authentication
[HIVE-8917] - HIVE-5679 adds two thread safety problems
[HIVE-8926] - Projections that only swap input columns are identified incorrectly as identity projections
[HIVE-8938] - Compiler should save the transform URI as input entity
[HIVE-8944] - TestCompactor fails with IncompatibleClassChangeError
[HIVE-8948] - TestStreaming is flaky
[HIVE-8964] - Some TestMiniTezCliDriver tests taking two hours
[HIVE-8965] - Enhance PTest to kill all processes between tests and to report when a TEST*.xml file is not generated
[HIVE-8967] - Fix bucketmapjoin7.q determinism
[HIVE-8975] - Possible performance regression on bucket_map_join_tez2.q
[HIVE-8978] - Fix test determinism issue for qfile: smb_mapjoin_1.q etc
[HIVE-8990] - mapjoin_mapjoin.q is failing on Tez (missed golden file update)
[HIVE-9001] - Ship with log4j.properties file that has a reliable time based rolling policy
[HIVE-9006] - hiveserver thrift api version is still 6
[HIVE-9011] - Fix parquet_join.q determinism
[HIVE-9024] - NullPointerException when starting webhcat server if templeton.hive.properties is not set
[HIVE-9032] - Help for orcfiledump script does not reflect new options
[HIVE-9048] - Hive build failed on hadoop-1 after HIVE-8828.
[HIVE-9055] - Tez: union all followed by group by followed by another union all gives error
[HIVE-9060] - Fix child operator references after NonBlockingOpDeDupProc
[HIVE-9077] - Set completer in CliDriver is not working
[HIVE-9096] - GenericUDF may be left unclosed in PartitionPrune#visitCall()
[HIVE-9113] - Explain on query failed with NPE
[HIVE-9120] - Hive Query log does not work when hive.exec.parallel is true
[HIVE-9122] - Need to remove additional references to hive-shims-common-secure, hive-shims-0.20
[HIVE-9129] - Migrate to newer Calcite snapshot, where ByteString is now in org.apache.calcite.avatica.util
[HIVE-9130] - vector_partition_diff_num_cols result is not updated after CBO upgrade
[HIVE-9131] - MiniTez optimize_nullscan test is unstable
[HIVE-9149] - Add unit test to test implicit conversion during dynamic partitioning/distribute by
[HIVE-9150] - Unrelated types are compared in GenTezWork#getFollowingWorkIndex()
[HIVE-9154] - Cache pathToPartitionInfo in context aware record reader
[HIVE-9177] - Fix child operator references after NonBlockingOpDeDupProc (II)
[HIVE-9195] - CBO changes constant to column type
[HIVE-9197] - fix lvj_mapjoin.q diff in trunk
[HIVE-9199] - Excessive exclusive lock used in some DDLs with DummyTxnManager
[HIVE-9203] - CREATE TEMPORARY FUNCTION hangs trying to acquire lock
[HIVE-9215] - Some mapjoin queries broken with IdentityProjectRemover with PPD
[HIVE-9221] - Remove deprecation warning for hive.metastore.local
[HIVE-9242] - Many places in CBO code eat exceptions
[HIVE-9243] - Static Map in IOContext is not thread safe
[HIVE-9255] - Fastpath for limited fetches from unpartitioned tables
[HIVE-9296] - Need to add schema upgrade changes for queueing events in the database
[HIVE-9299] - Reuse Configuration in AvroSerdeUtils
[HIVE-9300] - Make TCompactProtocol configurable
[HIVE-9301] - Potential null dereference in MoveTask#createTargetPath()
[HIVE-9309] - schematool fails on Postgres 8.1
[HIVE-9310] - CLI JLine does not flush history back to ~/.hivehistory
[HIVE-9316] - TestSqoop tests in WebHCat testsuite hardcode libdir path to hdfs
[HIVE-9330] - DummyTxnManager will throw NPE if WriteEntity writeType has not been set
[HIVE-9331] - get rid of pre-optimized-hashtable memory optimizations
[HIVE-9344] - Fix flaky test optimize_nullscan
[HIVE-9347] - Bug with max() together with rank() and grouping sets
[HIVE-9351] - Running Hive Jobs with Tez cause templeton to never report percent complete
[HIVE-9353] - make TABLE keyword optional in INSERT INTO TABLE foo...
[HIVE-9366] - wrong date in description annotation in date_add() and date_sub() udf
[HIVE-9369] - fix arguments length checking in Upper and Lower UDF
[HIVE-9377] - UDF in_file() in WHERE predicate causes NPE.
[HIVE-9381] - HCatalog hardcodes maximum append limit to 1000.
[HIVE-9382] - Query got rerun with Global Limit optimization on and Fetch optimization off
[HIVE-9386] - FileNotFoundException when using in_file()
[HIVE-9393] - reduce noisy log level of ColumnarSerDe.java:116 from INFO to DEBUG
[HIVE-9396] - date_add()/date_sub() should allow tinyint/smallint/bigint arguments in addition to int
[HIVE-9414] - Fixup post HIVE-9264 - Merge encryption branch to trunk
[HIVE-9437] - Beeline does not add any existing HADOOP_CLASSPATH
[HIVE-9440] - Folders may not be pruned for Hadoop 2
[HIVE-9441] - Remove call to deprecated Calcite method
[HIVE-9443] - ORC PPD - fix fuzzy case evaluation of IS_NULL
[HIVE-9445] - Revert HIVE-5700 - enforce single date format for partition column storage
[HIVE-9446] - JDBC DatabaseMetadata.getColumns() does not work for temporary tables
[HIVE-9448] - Merge spark to trunk 1/23/15
[HIVE-9454] - Test failures due to new Calcite version
[HIVE-9462] - HIVE-8577 - breaks type evolution
[HIVE-9475] - HiveMetastoreClient.tableExists does not work
[HIVE-9476] - Beeline fails to start on trunk
[HIVE-9502] - Parquet cannot read Map types from files written with Hive

New in Apache Hive 1.0.0 (Feb 5, 2015)

Bug:
[HIVE-5631] - Index creation on a skew table fails
[HIVE-5664] - Drop cascade database fails when the db has any tables with indexes
[HIVE-6198] - ORC file and struct column names are case sensitive
[HIVE-6468] - HS2 & Metastore using SASL out of memory error when curl sends a get request
[HIVE-7270] - SerDe Properties are not considered by show create table Command
[HIVE-8099] - IN operator for partition column fails when the partition column type is DATE
[HIVE-8295] - Add batch retrieve partition objects for metastore direct sql
[HIVE-8374] - schematool fails on Postgres versions < 9.2
[HIVE-8485] - HMS on Oracle incompatibility
[HIVE-8706] - Table statistic collection on counter failed due to table name character case.
[HIVE-8715] - Hive 14 upgrade scripts can fail for statistics if database was created using auto-create
[HIVE-8739] - handle Derby and Oracle errors with joins and filters in Direct SQL in an invalid-DB-specific path
[HIVE-8784] - Querying partition does not work with JDO enabled against PostgreSQL
[HIVE-8805] - CBO skipped due to SemanticException: Line 0:-1 Both left and right aliases encountered in JOIN 'avg_cs_ext_discount_amt'
[HIVE-8807] - Obsolete default values in webhcat-default.xml
[HIVE-8811] - Dynamic partition pruning can result in NPE during query compilation
[HIVE-8827] - Remove SSLv2Hello from list of disabled protocols
[HIVE-8830] - hcatalog process doesn't exit because of non-daemon thread
[HIVE-8845] - Switch to Tez 0.5.2
[HIVE-8866] - Vectorization on partitioned table throws ArrayIndexOutOfBoundsException when partitions are not of same #of columns
[HIVE-8870] - errors when selecting a struct field within an array from ORC based tables
[HIVE-8873] - Switch to calcite 0.9.2
[HIVE-8876] - incorrect upgrade script for Oracle (13->14)
[HIVE-8880] - non-synchronized access to split list in OrcInputFormat
[HIVE-8886] - Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup
[HIVE-8888] - Mapjoin with LateralViewJoin generates wrong plan in Tez
[HIVE-8947] - HIVE-8876 also affects Postgres < 9.2
[HIVE-8966] - Delta files created by hive hcatalog streaming cannot be compacted
[HIVE-9003] - Vectorized IF expr broken for the scalar and scalar case
[HIVE-9025] - join38.q (without map join) produces incorrect result when testing with multiple reducers
[HIVE-9038] - Join tests fail on Tez
[HIVE-9051] - TezJobMonitor in-place updates logs too often to logfile
[HIVE-9053] - select constant in union all followed by group by gives wrong result
[HIVE-9067] - OrcFileMergeOperator may create merge file that does not match properties of input files
[HIVE-9090] - Rename "Tez File Merge Work" to smaller name
[HIVE-9108] - Fix for HIVE-8735 is incorrect (stats with long paths)
[HIVE-9111] - Potential NPE in OrcStruct for list and map types
[HIVE-9112] - Query may generate different results depending on the number of reducers
[HIVE-9114] - union all query in cbo test has undefined ordering
[HIVE-9126] - Backport HIVE-8827 (Remove SSLv2Hello from list of disabled protocols) to 0.14 branch
[HIVE-9141] - HiveOnTez: mix of union all, distinct, group by generates error
[HIVE-9155] - HIVE_LOCKS uses int instead of bigint hive-txn-schema-0.14.0.mssql.sql
[HIVE-9162] - stats19 test is environment-dependent
[HIVE-9166] - Place an upper bound for SARG CNF conversion
[HIVE-9168] - Vectorized Coalesce for strings is broken
[HIVE-9205] - Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[HIVE-9234] - HiveServer2 leaks FileSystem objects in FileSystem.CACHE
[HIVE-9249] - java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.HiveVarcharWritable cannot be cast to org.apache.hadoop.hive.common.type.HiveVarchar when joining tables
[HIVE-9278] - Cached expression feature broken in one case
[HIVE-9317] - move Microsoft copyright to NOTICE file
[HIVE-9359] - Export of a large table causes OOM in Metastore and Client
[HIVE-9361] - Intermittent NPE in SessionHiveMetaStoreClient.alterTempTable
[HIVE-9390] - Enhance retry logic wrt DB access in TxnHandler
[HIVE-9401] - Backport: Fastpath for limited fetches from unpartitioned tables
[HIVE-9404] - NPE in org.apache.hadoop.hive.metastore.txn.TxnHandler.determineDatabaseProduct()
[HIVE-9436] - RetryingMetaStoreClient does not retry JDOExceptions
[HIVE-9473] - sql std auth should disallow built-in udfs that allow any java methods to be called
[HIVE-9514] - schematool is broken in hive 1.0.0
Improvement:
[HIVE-3280] - Make HiveMetaStoreClient a public API
[HIVE-8933] - Check release builds for SNAPSHOT dependencies
Task:
[HIVE-6977] - Delete Hiveserver1

New in Apache Hive 0.14.0 (Feb 5, 2015)

New in Apache Hive 0.12.0 (Feb 5, 2015)

Sub-task:
[HIVE-2304] - Support PreparedStatement.setObject
[HIVE-4055] - add Date data type
[HIVE-4266] - Refactor HCatalog code to org.apache.hive.hcatalog
[HIVE-4324] - ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[HIVE-4355] - HCatalog test TestPigHCatUtil might fail on JDK7
[HIVE-4460] - Publish HCatalog artifacts for Hadoop 2.x
[HIVE-4478] - In ORC, add boolean noNulls flag to column stripe metadata
[HIVE-4626] - join_vc.q is not deterministic
[HIVE-4646] - skewjoin.q is failing in hadoop2
[HIVE-4690] - stats_partscan_1.q makes different result with different hadoop.mr.rev
[HIVE-4708] - Fix TestCliDriver.combine2.q on 0.23
[HIVE-4711] - Fix TestCliDriver.list_bucket_query_oneskew_{1,2,3}.q on 0.23
[HIVE-4712] - Fix TestCliDriver.truncate_* on 0.23
[HIVE-4713] - Fix TestCliDriver.skewjoin_union_remove_{1,2}.q on 0.23
[HIVE-4715] - Fix TestCliDriver.{recursive_dir.q,sample_islocalmode_hook.q,input12.q,input39.q,auto_join14.q} on 0.23
[HIVE-4717] - Fix non-deterministic TestCliDriver on 0.23
[HIVE-4721] - Fix TestCliDriver.ptf_npath.q on 0.23
[HIVE-4746] - Fix TestCliDriver.list_bucket_dml_{2,4,5,9,12,13}.q on 0.23
[HIVE-4750] - Fix TestCliDriver.list_bucket_dml_{6,7,8}.q on 0.23
[HIVE-4756] - Upgrade Hadoop 0.23 profile to 2.0.5-alpha
[HIVE-4761] - ZooKeeperHiveLockManager.unlockPrimitive has race condition with threads
[HIVE-4762] - HMS cannot handle concurrent requests
[HIVE-4763] - add support for thrift over http transport in HS2
[HIVE-4767] - ObjectStore.getPMF has concurrency problems
[HIVE-4871] - Apache builds fail with Target "make-pom" does not exist in the project "hcatalog".
[HIVE-4894] - Update maven coordinates of HCatalog artifacts
[HIVE-4895] - Move all HCatalog classes to org.apache.hive.hcatalog
[HIVE-4896] - create binary backwards compatibility layer hcatalog 0.12 and 0.11
[HIVE-4908] - rename templeton to webhcat?
[HIVE-4940] - udaf_percentile_approx.q is not deterministic
[HIVE-4980] - Fix the compiling error in TestHadoop20SAuthBridge
[HIVE-5013] - [HCatalog] Create hcat.py, hcat_server.py to make HCatalog work on Windows
[HIVE-5014] - [HCatalog] Fix HCatalog build issue on Windows
[HIVE-5015] - [HCatalog] Fix HCatalog unit tests on Windows
[HIVE-5028] - Some tests fail with OutOfMemoryError PermGen Space on Hadoop2
[HIVE-5035] - [WebHCat] Hardening parameters for Windows
[HIVE-5036] - [WebHCat] Add cmd script for WebHCat
[HIVE-5063] - Fix some non-deterministic or not-updated tests
[HIVE-5066] - [WebHCat] Other code fixes for Windows
[HIVE-5069] - Tests on list bucketing are failing again in hadoop2
[HIVE-5078] - [WebHCat] Fix e2e tests on Windows plus test cases for new features
[HIVE-5163] - refactor org.apache.hadoop.mapred.HCatMapRedUtil
[HIVE-5213] - remove hcatalog/shims directory
[HIVE-5233] - move hbase storage handler to org.apache.hcatalog package
[HIVE-5236] - Change HCatalog spacing from 4 spaces to 2
[HIVE-5260] - Introduce HivePassThroughOutputFormat that allows Hive to use general purpose OutputFormats instead of HiveOutputFormats in StorageHandlers
[HIVE-5261] - Make the Hive HBase storage handler work from HCatalog, and use HiveStorageHandlers instead of HCatStorageHandlers
Bug:
[HIVE-2015] - Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
[HIVE-2379] - Hive/HBase integration could be improved
[HIVE-2473] - Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory.
[HIVE-2702] - Enhance listPartitionsByFilter to add support for integral types both for equality and non-equality
[HIVE-2905] - Desc table can't show non-ascii comments
[HIVE-3189] - cast ( as bigint) returning null values
[HIVE-3191] - timestamp - timestamp causes null pointer exception
[HIVE-3253] - ArrayIndexOutOfBounds exception for deeply nested structs
[HIVE-3256] - Update asm version in Hive
[HIVE-3264] - Add support for binary datatype to AvroSerde
[HIVE-3475] - INLINE UDTF doesn't convert types properly
[HIVE-3562] - Some limit can be pushed down to map stage
[HIVE-3588] - Get Hive to work with hbase 94
[HIVE-3632] - Upgrade datanucleus to support JDK7
[HIVE-3691] - TestDynamicSerDe failed with IBM JDK
[HIVE-3756] - "LOAD DATA" does not honor permission inheritance
[HIVE-3772] - Fix a concurrency bug in LazyBinaryUtils due to a static field
[HIVE-3810] - HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile
[HIVE-3846] - alter view rename NPEs with authorization on.
[HIVE-3891] - physical optimizer changes for auto sort-merge join
[HIVE-3926] - PPD on virtual column of partitioned table is not working
[HIVE-3953] - Reading of partitioned Avro data fails because of missing properties
[HIVE-3957] - Add pseudo-BNF grammar for RCFile to Javadoc
[HIVE-3978] - HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
[HIVE-4003] - NullPointerException in exec.Utilities
[HIVE-4051] - Hive's metastore suffers from 1+N queries when querying partitions & is slow
[HIVE-4057] - LazyHBaseRow may return cache data if the field is null and make the result wrong
[HIVE-4089] - javax.jdo : jdo2-api dependency not in Maven Central
[HIVE-4106] - SMB joins fail in multi-way joins
[HIVE-4171] - Current database in metastore.Hive is not consistent with SessionState
[HIVE-4181] - Star argument without table alias for UDTF is not working
[HIVE-4194] - JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
[HIVE-4214] - OVER accepts general expression instead of just function
[HIVE-4222] - Timestamp type constants cannot be deserialized in JDK 1.6 or less
[HIVE-4233] - The TGT gotten from class 'CLIService' should be renewed on time
[HIVE-4251] - Indices can't be built on tables whose schema info comes from SerDe
[HIVE-4290] - Build profiles: Partial builds for quicker dev
[HIVE-4295] - Lateral view makes invalid result if CP is disabled
[HIVE-4299] - exported metadata by HIVE-3068 cannot be imported because of wrong file name
[HIVE-4300] - ant thriftif generated code that is checkedin is not up-to-date
[HIVE-4322] - SkewedInfo in Metastore Thrift API cannot be deserialized in Python
[HIVE-4339] - build fails after branch (hcatalog version not updated)
[HIVE-4343] - HS2 with kerberos- local task for map join fails
[HIVE-4344] - CREATE VIEW fails when redundant casts are rewritten
[HIVE-4347] - Hcatalog build fails on Windows because javadoc command exceeds length limit
[HIVE-4348] - Unit test compile fails at hbase-handler project on Windows because of illegal escape character
[HIVE-4350] - support AS keyword for table alias
[HIVE-4351] - Thrift code generation fails due to hcatalog
[HIVE-4364] - beeline always exits with 0 status, should exit with non-zero status on error
[HIVE-4369] - Many new failures on hadoop 2
[HIVE-4375] - Single sourced multi insert consists of native and non-native table mixed throws NPE
[HIVE-4377] - Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
[HIVE-4392] - Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns
[HIVE-4403] - Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[HIVE-4406] - Missing "/" or "/" in hs2 jdbc uri switches mode to embedded mode
[HIVE-4407] - TestHCatStorer.testStoreFuncAllSimpleTypes fails because of null case difference
[HIVE-4418] - TestNegativeCliDriver failure message if cmd succeeds is misleading
[HIVE-4421] - Improve memory usage by ORC dictionaries
[HIVE-4422] - Test output need to be updated for Windows only unit test in TestCliDriver
[HIVE-4424] - MetaStoreUtils.java.orig checked in mistakenly by HIVE-4409
[HIVE-4428] - Misspelling in describe extended output
[HIVE-4430] - Semantic analysis fails in presence of certain literals in on clause
[HIVE-4433] - Fix C++ Thrift bindings broken in HIVE-4322
[HIVE-4435] - Column stats: Distinct value estimator should use hash functions that are pairwise independent
[HIVE-4436] - hive.exec.parallel=true doesn't work on hadoop-2
[HIVE-4438] - Remove unused join configuration parameter: hive.mapjoin.size.key
[HIVE-4439] - Remove unused join configuration parameter: hive.mapjoin.cache.numrows
[HIVE-4440] - SMB Operator spills to disk like it's 1999
[HIVE-4441] - [HCatalog] WebHCat does not honor user home directory
[HIVE-4442] - [HCatalog] WebHCat should not override user.name parameter for Queue call
[HIVE-4465] - webhcat e2e tests succeed regardless of exitvalue
[HIVE-4466] - Fix continue.on.failure in unit tests to -well- continue on failure in unit tests
[HIVE-4471] - Build fails with hcatalog checkstyle error
[HIVE-4474] - Column access not tracked properly for partitioned tables
[HIVE-4475] - Switch RCFile default to LazyBinaryColumnarSerDe
[HIVE-4486] - FetchOperator slows down SMB map joins by 50% when there are many partitions
[HIVE-4487] - Hive does not set explicit permissions on hive.exec.scratchdir
[HIVE-4489] - beeline always returns the same error message twice
[HIVE-4492] - Revert HIVE-4322
[HIVE-4496] - JDBC2 won't compile with JDK7
[HIVE-4497] - beeline module tests don't get run by default
[HIVE-4502] - NPE - subquery smb joins fails
[HIVE-4510] - HS2 doesn't nest exceptions properly (fun debug times)
[HIVE-4513] - disable hivehistory logs by default
[HIVE-4516] - Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
[HIVE-4521] - Auto join conversion fails in certain cases (empty tables, empty partitions, no partitions)
[HIVE-4525] - Support timestamps earlier than 1970 and later than 2038
[HIVE-4535] - hive build fails with hadoop 0.20
[HIVE-4540] - JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true
[HIVE-4542] - TestJdbcDriver2.testMetaDataGetSchemas fails because of unexpected database
[HIVE-4543] - Broken link in HCat 0.5 doc (Reader and Writer Interfaces)
[HIVE-4546] - Hive CLI leaves behind the per session resource directory on non-interactive invocation
[HIVE-4547] - A complex create view statement fails with new Antlr 3.4
[HIVE-4550] - local_mapred_error_cache fails on some hadoop versions
[HIVE-4554] - Failed to create a table from existing file if file path has spaces
[HIVE-4559] - hcatalog/webhcat scripts in tar.gz don't have execute permissions set
[HIVE-4562] - HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[HIVE-4566] - NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established
[HIVE-4572] - ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap
[HIVE-4573] - Support alternate table types for HiveServer2
[HIVE-4578] - Changes to Pig's test harness broke HCat e2e tests
[HIVE-4580] - Change DDLTask to report errors using canonical error messages rather than http status codes
[HIVE-4581] - HCat e2e tests broken by changes to Hive's describe table formatting
[HIVE-4585] - Remove unused MR Temp file localization from Tasks
[HIVE-4586] - [HCatalog] WebHCat should return 404 error for undefined resource
[HIVE-4589] - Hive Load command failed when inpath contains space or any restricted characters
[HIVE-4591] - Making changes to webhcat-site.xml have no effect
[HIVE-4593] - ErrorMsg has several messages that reuse the same error code
[HIVE-4611] - SMB joins fail based on bigtable selection policy.
[HIVE-4615] - Invalid column names allowed when created dynamically by a SerDe
[HIVE-4618] - show create table creating unusable DDL when field delimiter is \001
[HIVE-4619] - Hive 0.11.0 is not working with pre-cdh3u6 and hadoop-0.23
[HIVE-4638] - Thread local PerfLog can get shared by multiple hiveserver2 sessions
[HIVE-4650] - Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x
[HIVE-4657] - HCatalog checkstyle violation after HIVE-2670
[HIVE-4677] - [HCatalog] WebHCat e2e tests fail on Hadoop 2
[HIVE-4679] - WebHCat can deadlock Hadoop if the number of concurrently running tasks is higher than or equal to the number of mappers
[HIVE-4683] - fix coverage org.apache.hadoop.hive.cli
[HIVE-4689] - For outerjoins, joinEmitInterval might make wrong result
[HIVE-4691] - orc_createas1.q has minor inconsistency
[HIVE-4692] - Constant agg parameters will be replaced by ExprNodeColumnDesc with single-sourced multi-gby cases
[HIVE-4696] - WebHCat e2e test framework is missing files and instructions
[HIVE-4707] - Support configurable domain name for HiveServer2 LDAP authentication using Active Directory
[HIVE-4710] - ant maven-build -Dmvn.publish.repo=local fails
[HIVE-4724] - ORC readers should have a better error detection for non-ORC files
[HIVE-4730] - Join on more than 2^31 records on single reducer failed (wrong results)
[HIVE-4733] - HiveLockObjectData is not compared properly
[HIVE-4740] - HIVE-2379 is missing hbase.jar itself
[HIVE-4742] - A useless CAST makes Hive fail to create a VIEW based on an UNION
[HIVE-4748] - Fix TempletonUtilsTest failure on Windows
[HIVE-4757] - LazyTimestamp goes into irretrievable NULL mode once inited with NULL once
[HIVE-4781] - LEFT SEMI JOIN generates wrong results when the number of rows belonging to a single key of the right table exceed hive.join.emit.interval
[HIVE-4784] - ant testreport doesn't include any HCatalog tests
[HIVE-4785] - Implement isCaseSensitive for Hive JDBC driver
[HIVE-4789] - FetchOperator fails on partitioned Avro data
[HIVE-4798] - NPE when we call isSame from an instance of ExprNodeConstantDesc with null value
[HIVE-4802] - Fix url check for missing "/" or "/ after hostname in jdbc uri
[HIVE-4804] - parallel order by fails for small datasets
[HIVE-4807] - Hive metastore hangs
[HIVE-4808] - WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
[HIVE-4810] - Refactor exec package
[HIVE-4811] - (Slightly) break up the SemanticAnalyzer monstrosity
[HIVE-4812] - Logical explain plan
[HIVE-4814] - Adjust WebHCat e2e tests until HIVE-4703 is addressed
[HIVE-4818] - SequenceId in operator is not thread safe
[HIVE-4820] - webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure
[HIVE-4829] - TestWebHCatE2e checkstyle violation causes all tests to fail
[HIVE-4830] - Test clientnegative/nested_complex_neg.q got broken due to 4580
[HIVE-4833] - Fix eclipse template classpath to include the correct jdo lib
[HIVE-4836] - make checkstyle ignore IntelliJ files and templeton e2e files
[HIVE-4838] - Refactor MapJoin HashMap code to improve testability and readability
[HIVE-4839] - build-common.xml has
[HIVE-4840] - Fix eclipse template classpath to include the BoneCP lib
[HIVE-4843] - Refactoring MapRedTask and ExecDriver for better re-usability (for tez) and readability
[HIVE-4845] - Correctness issue with MapJoins using the null safe operator
[HIVE-4852] - -Dbuild.profile=core fails
[HIVE-4853] - junit timeout needs to be updated
[HIVE-4854] - testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[HIVE-4863] - Fix parallel order by on hadoop2
[HIVE-4865] - HiveLockObjects: Unlocking retries/times out when query contains ":"
[HIVE-4869] - Clean up HCatalog build post Hive integration
[HIVE-4870] - Explain Extended to show partition info for Fetch Task
[HIVE-4875] - hive config template is not parse-able due to angle brackets in description
[HIVE-4876] - Beeline help text does not contain -f and -e parameters
[HIVE-4878] - With Dynamic partitioning, some queries would scan default partition even if query is not using it.
[HIVE-4883] - TestHadoop20SAuthBridge tests fail sometimes because of race condition
[HIVE-4891] - Distinct includes duplicate records
[HIVE-4892] - PTest2 cleanup after merge
[HIVE-4893] - [WebHCat] HTTP 500 errors should be mapped to 400 for bad request
[HIVE-4899] - Hive returns non-meaningful error message for ill-formed fs.default.name
[HIVE-4900] - Fix the mismatched column names in package.jdo
[HIVE-4915] - unit tests fail on windows because of difference in input file size
[HIVE-4927] - When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[HIVE-4928] - Date literals do not work properly in partition spec clause
[HIVE-4929] - the type of all numeric constants is changed to double in the plan
[HIVE-4930] - Classes of metastore should not be included in MR-task
[HIVE-4932] - PTFOperator fails resetting PTFPersistence
[HIVE-4935] - Potential NPE in MetadataOnlyOptimizer
[HIVE-4942] - Fix eclipse template files to use correct datanucleus libs
[HIVE-4951] - combine2_win.q.out needs update for HIVE-3253 (increasing nesting levels)
[HIVE-4952] - When hive.join.emit.interval is small, queries optimized by Correlation Optimizer may generate wrong results
[HIVE-4955] - serde_user_properties.q.out needs to be updated
[HIVE-4962] - fix eclipse template broken by HIVE-3256
[HIVE-4964] - Cleanup PTF code: remove code dealing with non standard sql behavior we had originally introduced
[HIVE-4968] - When deduplicating multiple SelectOperators, we should update RowResolver accordingly
[HIVE-4970] - BinaryConverter does not respect nulls
[HIVE-4972] - update code generated by thrift for DemuxOperator and MuxOperator
[HIVE-4987] - Javadoc can generate argument list too long error
[HIVE-4990] - ORC seeks fails with non-zero offset or column projection
[HIVE-4991] - hive build with 0.20 is broken
[HIVE-4995] - select * may incorrectly return empty fields with hbase-handler
[HIVE-4998] - support jdbc documented table types in default configuration
[HIVE-5010] - HCatalog maven integration doesn't override mvn.local.repo in two locations
[HIVE-5011] - Dynamic partitioning in HCatalog broken on external tables
[HIVE-5012] - [HCatalog] Make HCatalog work on Windows
[HIVE-5017] - DBTokenStore gives compiler warnings
[HIVE-5023] - Hive gets wrong result when partition has the same path but different schema or authority
[HIVE-5026] - HIVE-3926 is committed in the state of not rebased to trunk
[HIVE-5034] - [WebHCat] Make WebHCat work for Windows
[HIVE-5046] - Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[HIVE-5047] - Hive client filters partitions incorrectly via pushdown in certain cases involving "or"
[HIVE-5048] - StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.
[HIVE-5049] - Create an ORC test case that has a 0.11 ORC file
[HIVE-5051] - StorageBasedAuthorizationProvider masks lower level exception with IllegalStateException
[HIVE-5055] - SessionState temp file gets created in history file directory
[HIVE-5056] - MapJoinProcessor ignores order of values in removing RS
[HIVE-5060] - JDBC driver assumes executeStatement is synchronous
[HIVE-5061] - Row sampling throws NPE when used in sub-query
[HIVE-5075] - bug in ExprProcFactory.genPruner
[HIVE-5079] - Make Hive compile under Windows
[HIVE-5084] - Fix newline.q on Windows
[HIVE-5085] - Hive Metatool errors out if HIVE_OPTS is set
[HIVE-5087] - Rename npath UDF to matchpath
[HIVE-5089] - Non query PreparedStatements are always failing on remote HiveServer2
[HIVE-5091] - ORC files should have an option to pad stripes to the HDFS block boundaries
[HIVE-5100] - RCFile::sync(long) missing 1 byte in System.arraycopy()
[HIVE-5104] - HCatStorer fails to store boolean type
[HIVE-5105] - HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap
[HIVE-5106] - HCatFieldSchema overrides equals() but not hashCode()
[HIVE-5120] - document what hive.server2.thrift.sasl.qop values mean in hive-default.xml.template
[HIVE-5122] - Add partition for multiple partition ignores locations for non-first partitions
[HIVE-5123] - group by on a same key producing wrong result
[HIVE-5127] - Upgrade xerces and xalan for WebHCat
[HIVE-5128] - Direct SQL for view is failing
[HIVE-5129] - Multiple table insert fails on count(distinct)
[HIVE-5131] - JDBC client's hive variables are not passed to HS2
[HIVE-5137] - A Hive SQL query should not return a ResultSet when the underlying plan does not include a FetchTask
[HIVE-5144] - HashTableSink allocates empty new Object[] arrays & OOMs - use a static emptyRow instead
[HIVE-5145] - Fix TestCliDriver.list_bucket_query_multiskew_2.q on hadoop 0.23
[HIVE-5149] - ReduceSinkDeDuplication can pick the wrong partitioning columns
[HIVE-5156] - HiveServer2 jdbc ResultSet.close should free up resources on server side
[HIVE-5161] - Additional SerDe support for varchar type
[HIVE-5167] - webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
[HIVE-5196] - ThriftCLIService.java uses stderr to print the stack trace, it should use the logger instead.
[HIVE-5198] - WebHCat returns exitcode 143 (w/o an explanation)
[HIVE-5199] - Custom SerDe containing a nonSettable complex data type row object inspector throws cast exception with HIVE 0.11
[HIVE-5203] - FunctionRegistry.getMethodInternal() should prefer method arguments with closer affinity to the original argument types
[HIVE-5210] - WebHCatJTShim implementations are missing Apache license headers
[HIVE-5239] - LazyDate goes into irretrievable NULL mode once inited with NULL once
[HIVE-5241] - Default log4j log level for WebHCat should be INFO not DEBUG
[HIVE-5246] - Local task for map join submitted via oozie job fails on a secure HDFS
[HIVE-5255] - Missing metastore schema files for version 0.11
[HIVE-5265] - Direct SQL fallback broken on Postgres
[HIVE-5274] - HCatalog package renaming backward compatibility follow-up
[HIVE-5285] - Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
[HIVE-5292] - Join on decimal columns fails to return rows
[HIVE-5296] - Memory leak: OOM Error after multiple open/closed JDBC connections.
[HIVE-5297] - Hive does not honor type for partition columns
[HIVE-5301] - Add a schema tool for offline metastore schema upgrade
[HIVE-5322] - FsPermission is initialized incorrectly in HIVE 5513
[HIVE-5329] - Date and timestamp type converts invalid strings to '1970-01-01'
[HIVE-5337] - org.apache.hcatalog.common.HCatUtil is used by org.apache.hive.hcatalog.templeton.tool
[HIVE-5352] - cast('1.0' as int) returns null
[HIVE-5357] - ReduceSinkDeDuplication optimizer pick the wrong keys in pRS-cGBYm-cRS-cGBYr scenario when there are distinct keys in child GBY
[HIVE-5362] - TestHCatHBaseInputFormat has a bug which will not allow it to run on JDK7 and RHEL 6
[HIVE-5364] - NPE on some queries from partitioned orc table
[HIVE-5374] - hive-schema-0.13.0.postgres.sql doesn't work
[HIVE-5375] - Bug in Hive-0.12 branch with parameterized types due to merge conflict with HIVE-5199
[HIVE-5394] - ObjectInspectorConverters.getConvertedOI() does not return the correct object inspector for primitive type.
[HIVE-5401] - Array Out Of Bounds in OrcRecordReader
[HIVE-5402] - StorageBasedAuthorizationProvider is not correctly able to determine that it is running from client-side
[HIVE-5405] - Need to implement PersistenceDelegate for org.antlr.runtime.CommonToken
[HIVE-5410] - Hive command line option --auxpath still does not work post HIVE-5363
[HIVE-5413] - StorageDelegationAuthorizationProvider uses non-existent org.apache.hive.hcatalog.hbase.HBaseHCatStorageHandler
[HIVE-5416] - templeton/tests/jobsubmission2.conf erroneously removed
[HIVE-5419] - Fix schema tool issues with Oracle metastore
[HIVE-5426] - TestThriftBinaryCLIService tests fail on branch 0.12
[HIVE-5429] - HiveVarcharWritable length not reset when value is changed
[HIVE-5431] - PassthroughOutputFormat SH changes causes IllegalArgumentException
[HIVE-5433] - Fix varchar unit tests to work with hadoop-2.1.1
[HIVE-5476] - Authorization-provider tests fail in sequential run
[HIVE-5477] - maven-publish fails because it can't find hive-metastore-0.12.0.pom
[HIVE-5488] - some files are missing apache license headers
[HIVE-5489] - NOTICE copyright dates are out of date, README needs update
[HIVE-5493] - duplicate jars with different versions for guava, commons-logging
[HIVE-5497] - Hive trunk broken against hadoop 0.20.2
[HIVE-5769] - when "hive.server2.authentication" set "NONE", is "hive.server2.enable.doAs" always work?
[HIVE-5864] - Hive Table filter Not working (ERROR:SemanticException MetaException)
Improvement:
[HIVE-2084] - Upgrade datanucleus from 2.0.3 to a more recent version (3.?)
[HIVE-2608] - Do not require AS a,b,c part in LATERAL VIEW
[HIVE-2906] - Support providing some table properties by user via SQL
[HIVE-3603] - Enable client-side caching for scans on HBase
[HIVE-3725] - Add support for pulling HBase columns with prefixes
[HIVE-3764] - Support metastore version consistency check
[HIVE-3807] - Hive authorization should use short username when Kerberos authentication
[HIVE-4002] - Fetch task aggregation for simple group by query
[HIVE-4068] - Size of aggregation buffer which uses non-primitive type is not estimated correctly
[HIVE-4172] - JDBC2 does not support VOID type
[HIVE-4209] - Cache evaluation result of deterministic expression and reuse it
[HIVE-4228] - Bump up hadoop2 version in trunk
[HIVE-4241] - optimize hive.enforce.sorting and hive.enforce bucketing join
[HIVE-4268] - Beeline should support the -f option
[HIVE-4294] - Single sourced multi query cannot handle lateral view
[HIVE-4310] - optimize count(distinct) with hive.map.groupby.sorted
[HIVE-4393] - Make the deleteData flag accessible from DropTable/Partition events
[HIVE-4409] - Prevent incompatible column type changes
[HIVE-4423] - Improve RCFile::sync(long) 10x
[HIVE-4443] - [HCatalog] Have an option for GET queue to return all job information in single call
[HIVE-4444] - [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[HIVE-4459] - Script hcat is overriding HIVE_CONF_DIR variable
[HIVE-4530] - Enforce minimum ant version required in build script
[HIVE-4549] - JDBC compliance change TABLE_SCHEMA to TABLE_SCHEM
[HIVE-4579] - Create a SARG interface for RecordReaders
[HIVE-4588] - Support session level hooks for HiveServer2
[HIVE-4601] - WebHCat needs to support proxy users
[HIVE-4609] - Allow hive tests to specify an alternative to /tmp
[HIVE-4610] - HCatalog checkstyle violation after HIVE-4578
[HIVE-4617] - Asynchronous execution in HiveServer2 to run a query in non-blocking mode
[HIVE-4620] - MR temp directory conflicts in case of parallel execution mode
[HIVE-4647] - RetryingHMSHandler logs too many error messages
[HIVE-4658] - Make KW_OUTER optional in outer joins
[HIVE-4675] - Create new parallel unit test environment
[HIVE-4682] - Temporary files are not closed in PTFPersistence on jvm reuse.
[HIVE-4737] - Allow access to MapredContext
[HIVE-4772] - Enable parallel execution of various E2E tests
[HIVE-4825] - Separate MapredWork into MapWork and ReduceWork
[HIVE-4827] - Merge a Map-only task to its child task
[HIVE-4858] - Sort "show grant" result to improve usability and testability
[HIVE-4873] - Sort candidate functions in case of UDFArgumentException
[HIVE-4874] - Identical methods PTFDeserializer.addOIPropertiestoSerDePropsMap(), PTFTranslator.addOIPropertiestoSerDePropsMap()
[HIVE-4877] - In ExecReducer, remove tag from the row which will be passed to the first Operator at the Reduce-side
[HIVE-4879] - Window functions that imply order can only be registered at compile time
[HIVE-4885] - Alternative object serialization for execution plan in hive testing
[HIVE-4913] - Put deterministic ordering in the top-K ngrams output of UDF context_ngrams()
[HIVE-4920] - PTest2 handle Spot Price increases gracefully and improve rsync parallelism
[HIVE-4948] - WriteLockTest and ZNodeNameTest do not follow test naming pattern
[HIVE-4954] - PTFTranslator hardcodes ranking functions
[HIVE-4960] - lastAlias in CommonJoinOperator is not used
[HIVE-4967] - Don't serialize unnecessary fields in query plan
[HIVE-4985] - refactor/clean up partition name pruning to be usable inside metastore server
[HIVE-4992] - add ability to skip javadoc during build
[HIVE-5006] - Re-factor HiveServer2 JDBC PreparedStatement to avoid duplicate code
[HIVE-5027] - Upgrade Ivy to 2.3
[HIVE-5031] - [WebHCat] GET job/:jobid to return userargs for a job in addition to status information
[HIVE-5062] - Insert + orderby + limit does not need additional RS for limiting rows
[HIVE-5111] - ExprNodeColumnDesc doesn't distinguish partition and virtual columns, causing partition pruner to receive the latter
[HIVE-5121] - Remove obsolete code on SemanticAnalyzer#genJoinTree
[HIVE-5158] - allow getting all partitions for table to also use direct SQL path
[HIVE-5182] - log more stuff via PerfLogger
[HIVE-5206] - Support parameterized primitive types
[HIVE-5209] - JDBC support for varchar
[HIVE-5267] - Use array instead of Collections if possible in DemuxOperator
[HIVE-5278] - Move some string UDFs to GenericUDFs, for better varchar support
[HIVE-5363] - HIVE-3978 broke the command line option --auxpath
New Feature:
[HIVE-305] - Port Hadoop streaming's counters/status reporters to Hive Transforms
[HIVE-1402] - Add parallel ORDER BY to Hive
[HIVE-2206] - add a new optimizer for query correlation discovery and optimization
[HIVE-2482] - Convenience UDFs for binary data type
[HIVE-2517] - Support group by on struct type
[HIVE-2655] - Ability to define functions in HQL
[HIVE-2670] - A cluster test utility for Hive
[HIVE-3255] - Add DBTokenStore to store Delegation Tokens in DB
[HIVE-4005] - Column truncation
[HIVE-4095] - Add exchange partition in Hive
[HIVE-4123] - The RLE encoding for ORC can be improved
[HIVE-4246] - Implement predicate pushdown for ORC
[HIVE-4531] - [WebHCat] Collecting task logs to hdfs
[HIVE-4614] - Support outer lateral view
[HIVE-4844] - Add varchar data type
[HIVE-4911] - Enable QOP configuration for Hive Server 2 thrift transport
[HIVE-4963] - Support in memory PTF partitions
Task:
[HIVE-4331] - Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[HIVE-4819] - Comments in CommonJoinOperator for aliasTag are not valid
[HIVE-4886] - beeline code should have apache license headers
[HIVE-4999] - Shim class HiveHarFileSystem does not have a hadoop2 counterpart
[HIVE-5059] - Meaningless warning message from TypeCheckProcFactory
[HIVE-5116] - HIVE-2608 didn't remove udtf_not_supported2.q test
[HIVE-5219] - Move VerifyingObjectStore into ql package
[HIVE-5313] - HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string)
Test:
[HIVE-4526] - auto_sortmerge_join_9.q throws NPE but test succeeds
[HIVE-4636] - Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
[HIVE-4645] - Stat information like numFiles and totalSize is not correct when a sub-directory exists
[HIVE-4743] - Improve test coverage of package org.apache.hadoop.hive.ql.io
[HIVE-4779] - Enhance coverage of package org.apache.hadoop.hive.ql.udf
[HIVE-4791] - improve test coverage of package org.apache.hadoop.hive.ql.udf.xml
[HIVE-4796] - Increase coverage of package org.apache.hadoop.hive.common.metrics
[HIVE-4805] - Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors
[HIVE-4813] - Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr
[HIVE-5029] - direct SQL perf optimization cannot be tested well
[HIVE-5033] - Test result of ppd_vc.q is not updated
[HIVE-5096] - Add q file tests for ORC predicate pushdown
[HIVE-5117] - orc_dictionary_threshold is not deterministic
[HIVE-5147] - Newly added test TestSessionHooks is failing on trunk
[HIVE-5197] - TestE2EScenerios.createTaskAttempt should use MapRedUtil

New in Apache Hive 0.11.0 (May 21, 2013)

Sub-task:
optimize orderby followed by a groupby
TypeInfoFactory is not thread safe and is accessed by multiple threads
InspectorFactories contains static HashMaps which can cause infinite loop
disable TestBeeLineDriver
disable TestBeeLineDriver in ptest util
Integrate HCatalog site into Hive site
Adjust build.xml package command to move all hcat jars and binaries into build
Move HCatalog trunk code from trunk/hcatalog/historical to trunk/hcatalog
HCatalog branches need to move out of trunk/hcatalog/historical
HCat needs to get current Hive jars instead of pulling them from maven repo
Merge HCat NOTICE file with Hive NOTICE file
Clean up remaining items in hive/hcatalog/historical/trunk
Bug:
Hive server is SHUTTING DOWN when invalid queries are being executed.
If all of the parameters of distinct functions exist in group by columns, query fails at runtime
ObjectInspectorConverters cannot convert Void types to Array/Map/Struct types.
should throw "Ambiguous column reference key" Exception in particular join condition
Aggregations without grouping should return NULL when applied to partitioning column of a partitionless table
Invalid tag is used for MapJoinProcessor
Filters on outer join with mapjoin hint is not applied correctly
Hive CI failing due to script_broken_pipe1.q
Comment indenting is broken for "describe" in CLI
HBase Handler doesn't handle NULLs properly
Hive compile errors under Java 7 (JDBC 4.1)
change hive.auto.convert.join's default value to true
LOAD DATA INPATH fails if a hdfs file with same name is added to table
Mixing avro and snappy gives null values
semi-colon in comments in .q file does not work
Result of outer join is not valid
HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification
user should not specify mapjoin to perform sort-merge bucketed join
Fix log4j configuration errors when running hive on hadoop23
PrimitiveObjectInspector doesn't handle timestamps properly
Merging join tree may reorder joins which could be invalid
Implement * or a.* for arguments to UDFs
Avro SerDe doesn't handle serializing Nullable types that require access to a Schema
release locks at the end of move tasks
NPE in union processing followed by lateral view followed by 2 group bys
When the group-by partition column type is Timestamp or STRING and its format contains "HH:MM:SS", a URISyntaxException occurs
reflect udf cannot find method which has arguments of primitive types and String, Binary, Timestamp types mixed
script_pipe.q fails when using JDK7
RCFileWriter does not implement the right function to support Federation
HiveMetaStoreFsImpl is not compatible with hadoop viewfs
Allow URIs without port to be specified in metatool
External JAR files on HDFS can lead to race condition with hive.downloaded.resources.dir
enhanceModel.notRequired is incorrectly determined
Multiple insert overwrite into multiple tables query stores same results in all tables
Renaming table changes table location scheme/authority
Hive Query Explain Plan JSON not being created properly
Patch: Hive's ivy internal resolvers need to use sourceforge for sqlline
Hive won't compile with -Dhadoop.mr.rev=20S
make optimizing multi-group by configurable
Error in groupSetExpression rule in Hive grammar
PTest doesn't work due to hive snapshot version upgrade to 11
Driver.validateConfVariables() should perform more validations
Provide hive operation name for hookContext
JDBCStatsPublisher fails when ID length exceeds length of ID column
union_remove_9.q fails in trunk (hadoop 23)
TestNegativeMinimrCliDriver_mapreduce_stack_trace.q fails on hadoop-1
Enable adding hooks to hive meta store init
BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also
HIVE-3750 broke TestParse
Sort merge join should work if join cols are a prefix of sort columns for each partition
Unit test failures due to unspecified order of results in "show grant" command
Add MapJoinDesc.isBucketMapJoin() as part of explain plan
testCliDriver_sample_islocalmode_hook fails on hadoop-1
stats19.q is failing on trunk
Regression introduced from HIVE-3401
testCliDriver_repair fails on hadoop-1
Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
NPE in SELECT when WHERE-clause is an and/or/not operation involving null
testCliDriver_combine2 fails on hadoop-1
testCliDriver_loadpart_err fails on hadoop-1
testCliDriver_input39 fails on hadoop-1
explain dependency should show the dependencies hierarchically in presence of views
Ptest failing due to "Argument list too long" errors
Concurrency issue in RCFile: multiple threads can use the same decompressor
Adding the name space for the maven task for the maven-publish target.
Consider creating a literal like "D" or "BD" for representing Decimal type constants
bug if different serdes are used for different partitions
Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
insert overwrite fails with stored-as-dir in cluster
Hive CLI needs UNSET TBLPROPERTY command
Insert overwrite doesn't create a dir if the skewed column position doesn't match
adding .gitattributes file for normalizing line endings during cross platform development
hive cli null representation in output is inconsistent
ppd.remove.duplicatefilters removing filters too aggressively
Aliased column in where clause for multi-groupby single reducer cannot be resolved
hour() function returns 12 hour clock value when using timestamp datatype
Multi-groupby optimization fails when same distinct column is used twice or more
Normalize left over CRLF files
Upgrade hbase dependency to 0.94
testHBaseNegativeCliDriver_cascade_dbdrop fails on hadoop-1
MAP JOIN for VIEW throws NULL pointer exception error
lot of tests failing for hadoop 23
negative value for hive.stats.ndv.error should be disallowed
wrong mapside groupby if no partition is being selected
something wrong with the hive-default.xml
Partition pruning fails on = expression
create view statement's outputs contain the view and a temporary dir.
Wrong data due to HIVE-2820
table_access_keys_stats.q fails with hadoop 0.23
Possible deadlock in ZK lock manager
Union with map-only query on one side and two MR job query on the other produces wrong results
For outer joins, when looping over the rows looking for filtered tags, it doesn't report progress
Normalize more CRLF line endings
Change test for HIVE-2332
recursive_dir.q fails on 0.23
join_filters_overlap.q fails on 0.23
join_nullsafe.q fails on 0.23
Potential overflow with new RCFileCat column sizes options
Add Oracle metastore upgrade script for 0.9 to 0.10.0
Hive release tarballs don't contain PostgreSQL metastore scripts
Skewed query fails if hdfs path has special characters
MiniMR test remains pending after test completion
avro_nullable_fields.q is failing in trunk
Hive 0.10 postgres schema script is broken
Cleanup after HIVE-3403
Maintain a clear separation between Windowing & PTF at the specification level.
Update new UDAFs introduced for Windowing to work with new Decimal Type
Fix select expr processing in PTF Operator
Update PTF invocation and windowing grammar
Hive RCFile::sync(long) does a sub-sequence linear search for sync blocks
PostgreSQL upgrade scripts are not valid
Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0
Mysql metastore upgrade script will end up with different schema than the full schema load
Hive client goes into infinite loop at 100% cpu
Incorrect status for AddPartition metastore event if RawStore commit fails
MapJoin failing with Distributed Cache error
PostgreSQL upgrade scripts are creating column with incorrect name
Derby metastore update script will fail when upgrading from 0.9.0 to 0.10.0
Thrift alter_table api doesn't validate column type
Bring parenthesis handling in windowing specification in compliance with sql standard
Hive Profiler dies with NPE
Name windowing functions consistently with the sql standard
NPE at runtime while selecting virtual column after joining three tables on different keys
Should be able to specify windowing spec without needing Between
Column Pruner for PTF Op
remove use of FunctionRegistry during PTF Op initialization
Hive compiler sometimes fails in semantic analysis / optimisation stage when boolean variable appears in WHERE clause.
fix ptf negative tests
Support multiple partitionings in a single Query
Disallow partition/sort and distribute/order combinations in windowing and partitioning spec
Extend rcfilecat to support (un)compressed size and number of rows
Followup to HIVE-701: reduce ambiguity in grammar
Map-join outer join produces incorrect results.
Hive eclipse build path update for string template jar
Make partition by optional in over clause
alterPartition and alterPartitions methods in ObjectStore swallow exceptions
Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
Altering a view partition fails with NPE
Add Lead & Lag UDAFs
allow expressions with over clause
Break up ptf tests in PTF, Windowing and Lead/Lag tests
PTF ColumnPruner doesn't account for Partition & Order expressions
Generated aliases for windowing expressions are broken
Use of hive.exec.script.allow.partial.consumption can produce partial results
Store complete names of tables in column access analyzer
Remove sprintf from PTFTranslator and use String.format()
decimal_3.q & decimal_serde.q fail on hadoop 2
problem in hive.map.groupby.sorted with distincts
ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids
OrcInputFormat assumes Hive always calls createValue
Remove System.gc() call from the map-join local-task loop
Hive localtask does not buffer disk-writes or reads
Hive MapJoinOperator unnecessarily deserializes values for all join-keys
Update Hive 0.10.0 RELEASE_NOTES.txt
Allow over() clause to contain an order by with no partition by
Partition by column does not have to be in order by
Default value in lag is not handled correctly
Window range specification should be more flexible
ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty
Queries fail if timestamp data not in expected format
remove support for lead/lag UDFs outside of UDAF args
Bring the Lead/Lag UDFs interface in line with Lead/Lag UDAFs
Fix eclipse template classpath to include new packages added by ORC file patch
ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
MiniDFS shim does not work for hadoop 2
Specifying alias for windowing function
Remove inferring partition specification behavior
Incorrect column mappings with over clause
bug with hive.auto.convert.join.noconditionaltask with outer joins
Cleanup aisle "ivy"
wrong results in big outer joins with array of ints
HiveProfiler NPE with ScriptOperator
NPE reading column of empty string from ORC file
need to add protobuf classes to hive-exec.jar
RetryingHMSHandler doesn't retry in enough cases
Hive converts bucket map join to SMB join even when tables are not sorted
union_remove_*.q fail on hadoop 2
[REGRESSION] FsShell.close closes filesystem, removing temporary directories
Round UDF converts BigInts to double
ORC fails with files with different numbers of columns
NonBlockingOpDeDup does not merge SEL operators correctly
Filter getting dropped with PTFOperator
doAs does not work with HiveServer2 in non-kerberos mode with local job
Document HiveServer2 setup under the admin documentation on hive wiki
Document HiveServer2 JDBC and Beeline CLI in the user documentation
NPE in ReduceSinkDeDuplication
QL build-grammar target fails after HIVE-4148
TestJdbcDriver2.testDescribeTable failing consistently
ORC fails with String column that ends in lots of nulls
OVER clauses with ORDER BY not getting windowing set properly
describe table output always prints as if formatted keyword is specified
Bring windowing support in line with SQL Standard
reuse Partition objects in PTFOperator processing
Clientpositive test parenthesis_star_by is non-deterministic
Fix show_create_table_*.q test failures
explain dependency does not capture the input table
CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
hiveserver2 string representation of complex types is inconsistent with cli
Code cleanup : debug methods, having clause associated with Windowing
update show_functions.q.out for functions added for windowing
SEL operator created with missing columnExprMap for unions
union_remove_12, union_remove_13 are failing on hadoop2
union_remove_10 is failing on hadoop2 with assertion (root task with non-empty set of parents)
fix last_value UDAF behavior
fix handling of binary type in hiveserver2, jdbc driver
bug in hive.map.groupby.sorted in the presence of multiple input partitions
Limit precision of decimal type
partition wise metadata does not work for text files
Hive does not differentiate scheme and authority in file uris
TestRetryingHMSHandler is failing on trunk.
Add IntelliJ project files to .gitignore
HCatalog build fails when behind a firewall
hiveserver2 should support -hiveconf commandline parameter
ant thriftif fails on hcatalog
Fix how RowSchema and RowResolver are set on ReduceSinkOp that precedes PTFOp
empty java files in hcatalog
Newly added test TestCliDriver.hiveprofiler_union0 is failing on trunk
DOS line endings in auto_join26.q
enable doAs in unsecure mode for hive server2, when MR job runs locally
OperatorHooks hit performance even when not used
Revert changes checked-in as part of HIVE-1953
Consider extending max limit for precision to 38
sqlline dependency is not required
NPE in constant folding with decimal
orc*.q tests fail on hadoop 2
most windowing tests fail on hadoop 2
ctas test on hadoop 2 has outdated golden file
serde_regex test fails on hadoop 2
Selecting from a view, and another view that also selects from that view fails
NPE for query involving UNION ALL with nested JOIN and UNION ALL
Guava not getting included in build package
remove duplicate impersonation parameters for hiveserver2
Check for Map side processing in PTFOp is no longer valid
wrong result in left semi join
some issue with merging join trees
Hive Version returned by HiveDatabaseMetaData.getDatabaseProductVersion is incorrect
Counters hit performance even when not used
ant maven-build fails because hcatalog doesn't have a make-pom target
test leadlag.q fails
HS2 Resource leak: operation handles not cleaned when originating session is closed
TestHCatStorer.testStoreFuncAllSimpleTypes fails because of null case difference
PTFDesc tries to serialize transient fields like OIs, etc.
webhcat - support ${WEBHCAT_PREFIX}/conf/ as config directory
HCatalog unit tests stop after a failure
Improve memory usage by ORC dictionaries
hcatalog version numbers need to be updated
HCatalog build directories get included in tar file produced by "ant tar"
hcatalog jars not getting published to maven repo
ORC map columns get class cast exception in some context
TestBeeLineWithArgs.testPositiveScriptFile fails
HS2 holding too many file handles of hive_job_log_hive_*.txt files
Hive can't load transforms added using 'ADD FILE'
Fix eclipse project template
Improvement:
improve group by syntax
more query plan optimization rules
Hive should process comments in CliDriver
Upgrade antlr version to 3.4
Use name of original expression for name of CAST output
RegexSerDe should support other column types in addition to STRING
msck repair should find partitions already containing data files
Add environment context to metastore Thrift calls
Diversify grammar for split sampling
Avoid race conditions while downloading resources from non-local filesystem
Provide ALTER for partition changing bucket number
Allow CREATE TABLE LIKE command to take TBLPROPERTIES
Simple lock manager for dedicated hive server
hivetest.py: revision number and applied patch
Provide a way to use counters in Hive through UDF
sort-merge join does not work with sub-queries
Support altering partition column type in Hive
Add mapreduce workflow information to job configuration
Stop storing default ConfVars in temp file
HiveConf.ConfVars.HIVE_STATS_COLLECT_RAWDATASIZE should not be checked in FileSinkOperator
Minor fix for 'tableName' in Hive.g
de-emphasize mapjoin hint
Print number of fetched rows after query in CliDriver
Multi-insert involving bucketed/sorted table turns off merging on all outputs
Better error message if metalisteners or hookContext cannot be loaded/instantiated
Resolve TODO in TUGIBasedProcessor
object inspectors should be initialized based on partition metadata
UDF unix_timestamp is deterministic if an argument is given, but it treated as non-deterministic preventing PPD
Create a new Optimized Row Columnar file format for Hive
Better align columns in DESCRIBE table_name output to make more human-readable
Replace hashmaps in JoinOperators to array
Support noscan operation for analyze command
Remove code for merging files via MR job
merge map-job followed by map-reduce job
support partial scan for analyze command - RCFile
Clean up/fix PartitionNameWhitelistPreEventListener
Correctly enforce the memory limit on the multi-table map-join
Add o.a.h.h.serde.Constants for backward compatibility
Create abstract classes for serializer and deserializer
Add ORC file to the grammar as a file format
Remove init(fname) from TestParse.vm for each test
Swap applying order of CP and PPD
Improve Error Logging in MetaStore
Add reflect UDF for member method invocation of column
ignore mapjoin hint
Modify PreDropPartitionEvent to pass Table parameter
Refactor code for finding windowing expressions
Expose metastore JMX metrics
Support avg(decimal)
Window handling dumps debug info on console, instead should use logger.
ORC runs out of heap when writing
Sort merge join does not work for outer joins for 7 inputs
sort merge join should work for outer joins for more than 8 inputs
optimize hive.enforce.bucketing and hive.enforce.sorting insert
Log logical plan tree for debugging
add hive.map.groupby.sorted.testmode
Remove unused builtins and pdk submodules
PTFDeserializer should reconstruct OIs based on InputOI passed to PTFOperator
Change default bigtable selection policy for sort-merge joins
New Feature:
Implement TRUNCATE
lots of reserved keywords in hive
Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
Infer bucketing/sorting properties
Adding the oracle nvl function to the UDF
Specify location of log4j configuration files via configuration properties
Add DECIMAL data type
Implement HiveServer2
Hive List Bucketing - DML support
HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys
Add 'IGNORE PROTECTION' predicate for dropping partitions
When writing a Hive table to a file, users should be able to choose their own separator
Add Operator level Hooks
Support ALTER VIEW AS SELECT in Hive
Add a way to get the uncompressed/compressed sizes of columns from an RC File
getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers
Allow updating bucketing/sorting metadata of a partition through the CLI
Hive Profiler
Allow Decimal type columns in Regex Serde
Ability to create and drop temporary partition function
Allow partition by/order by in partitioning spec in over clause and partition function
Implement decimal encoding for ORC
Testing with Hadoop 2.x causes test failure for ORC's TestFileDump
Expose ORC's FileDump as a service
Implement a memory manager for ORC
Task:
Unescape partition names returned by show partitions
Add check to determine whether partition can be dropped at Semantic Analysis time
ALTER TABLE ADD PARTS should check for valid partition spec and throw a SemanticException if part spec is not valid
Add input table name to MetaStoreEndFunctionContext for logging purposes
Track columns accessed in each table in a query
Split up tests in ptf_general_queries.q
Merge PTFDesc and PTFDef classes
Add apache headers in new files
Create hcatalog stub directory and add it to the build
Test:
add a way to run a small unit quickly
Remove redundant test codes
Accept qfile argument for miniMR tests
TestMetaStoreAuthorization always uses the same port
Add more tests for windowing
add tests for distincts for hive.map.groupby.sorted
Update list bucketing test results
Wish:
Result of mapjoin_test_outer.q is not deterministic

New in Apache Hive 0.10.0 (May 21, 2013)

Sub-task:
Optimizer statistics on columns in tables and partitions
Support external hive tables whose data are stored in Azure blob store/Azure Storage Volumes (ASV)
Remove the duplicate JAR entries from the (“test.classpath”) to avoid command line exceeding char limit on windows
Windows: Fix the unit tests which contains “!” commands (Unix shell commands)
FileUtils.tar does not close input files
Fix “TestDosToUnix” unit tests on Windows by closing the leaking file handle in DosToUnix.java.
Fix the “TestHiveHistory”, “TestHiveConf”, & “TestExecDriver” unit tests on Windows by fixing the path related issues.
Handle “CRLF” line endings to avoid the extra spacing in generated test outputs in Windows. (Utilities.Java :: readColumn)
Remove the Unix specific absolute path of “Cat” utility in several .q files to make them run on Windows with CygWin in path.
PartitionPruner should log why it is not pushing the filter down to JDO
Bug:
cluster by multiple columns does not work if parenthesis is present
Nested UDAFs cause Hive Internal Error (NullPointerException)
DESCRIBE TABLE syntax doesn't support specifying a database qualified table name
mapjoin sometimes gives wrong results if there is a filter in the on condition
java.io.IOException: error=7, Argument list too long
Group by operator does not estimate size of Timestamp & Binary data correctly
LATERAL VIEW with EXPLODE produces ConcurrentModificationException
DROP DATABASE CASCADE does not drop non-native tables.
Nullpointer on registering udfs.
Hive Ivy dependencies on Hadoop should depend on jars directly, not tarballs
Make the header of RCFile unique
Upgrade Thrift dependency to 0.9.0
ability to select a view qualified by the database / schema name
Reduce Sink deduplication fails if the child reduce sink is followed by a join
Hive UDFs cannot emit binary constants
hive can't find hadoop executor scripts without HADOOP_HOME set
When integrating into MapReduce2, Hive is unable to handle corrupt rcfile archive
query_properties.q contains non-deterministic queries
NPE in "create index" without comment clause in external metastore
utc_from_timestamp and utc_to_timestamp return incorrect results.
Task log retrieval fails on Hadoop 0.23
TestNegativeCliDriver autolocal1.q fails on 0.23
Renaming external partition changes location
ant gen-test failed
Hive error when dropping a table with large number of partitions
Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
race condition in DAG execute tasks for hive
analyze command throws NPE when table doesn't exist
Hive should expand nested structs when setting the table schema from thrift structs
substr on string containing UTF-8 characters produces StringIndexOutOfBoundsException
Queries consisting of a metadata-only query always return an empty value
Hive JDBC doesn't support TIMESTAMP column
metastore delegation token is not getting used by hive commandline
GET_JSON_OBJECT fails on some valid JSON keys
Filter parsing does not recognize '!=' as operator and silently ignores invalid tokens
Fix maven-build Ant target
Fix test failure in TestNegativeCliDriver.dyn_part_max caused by HIVE-2918
Remove hadoop-source Ivy resolvers and Ant targets
Offline build is not working
Potential infinite loop / log spew in ZookeeperHiveLockManager
Memory leak in TUGIContainingTransport
TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly
Fix metastore test failures caused by HIVE-2757
Add JUnit to list of test dependencies managed by Ivy
Tests failing for me
Fix javadoc again
Update ShimLoader to work with Hadoop 2.x
escape more chars for script operator
hive docs target does not work
Modify clean target to remove ~/.ivy2/local/org.apache.hive ~/.ivy2/cache/org.apache.hive
Partition column values are not valid if any of virtual columns is selected
setup classpath for templates correctly for eclipse
TestHadoop20SAuthBridge always uses the same port
metastore.HiveMetaStore$HMSHandler should set the thread local raw store to null in shutdown()
hive.transform.escape.input breaks tab delimited data
revert HIVE-2703
Insert into table overwrites existing table if table name contains uppercase character
drop partition for non-string columns is failing
Drop partition problem
Filter on outer join condition removed while merging join tree
drop partition does not work for non-partition columns
Revert HIVE-2989
ROFL Moment. Numberator and denaminator typos
Oracle Metastore schema script doesn't include DDL for DN internal tables
make parallel tests work
Timestamp type values not having nano-second part breaks row
Hive tests should load Hive classes from build directory, not Ivy cache
Memory leak from large number of FileSystem instances in FileSystem.CACHE
Add HiveCLI that runs over JDBC
dropTable will always execute hook.rollbackDropTable whether drop table succeeded or failed.
clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed
make copyLocal work for parallel tests
Hadoop20Shim.CombineFileRecordReader does not report progress within files
Error in Removing ProtectMode from a Table
sort_array doesn't work with LazyPrimitive
Generate & build the velocity based Hive tests on windows by fixing the path issues
Pass hconf values as XML instead of command line arguments to child JVM
use commons-compress instead of forking tar process
Drop table/index/database can result in orphaned locations
add an option in ptest to run on a single machine
Comment indenting is broken for "describe" in CLI
Bug in parallel test for singlehost flag
Dynamically generated partitions deleted by Block level merge
drop the temporary function at end of autogen_colalias.q
Fix non-deterministic testcases failures when running Hive0.9.0 on MapReduce2
Hive thrift code doesn't generate quality hashCode()
LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of underlying BytesWritable
Bucketed sort merge join doesn't work when multiple files exist for small alias
retry not honored in RetryingRawMetastore
Fix Eclipse classpath template broken in HIVE-3128
Drop partition throws NPE if table doesn't exist
Bucketed mapjoin on partitioned table which has no partition throws NPE
FileUtils.tar assumes wrong directory in some cases
JobDebugger should use RunningJob.getTrackingURL
Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly
HiveConf.getPositionFromInternalName does not support more than single digit column numbers
NPE on a join query with authorization enabled
ColumnPruner is not working on LateralView
Make logging of plan progress in HadoopJobExecHelper configurable
Resource Leak: Fix the File handle leak in EximUtil.java
Fix non-deterministic results in newline.q and timestamp_lazy.q
Fix cascade_dbdrop.q when building hive on hadoop0.23
ignore white space between entries of hive/hbase table mapping
java primitive type for binary datatype should be byte[]
Sorted by order of table not respected
lack of semi-colon in .q file leads to missing the next statement
Upgrade guava to 11.0.2
Hive doesn't remove scratch directories while killing running MR job
Fix avro_joins.q testcase failure when building hive on hadoop0.23
alter the number of buckets for a non-empty partitioned table should not be allowed
bucketed mapjoin silently ignores mapjoin hint
HiveHistory.printRowCount() throws NPE
escaped columns in cluster/distribute/order/sort by are not working
expressions in cluster by are not working
Add avro jars into hive execution classpath
Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
optimize union sub-queries
Table schema not being copied to Partitions with no columns
Convert runtime exceptions to semantic exceptions for missing partitions/tables in show/describe statements
bucket information should be used from the partition instead of the table
sort merge join may not work silently
fix fs resolvers
Load file into a table does not update table statistics
HIVE-3128 introduced bug causing dynamic partitioning to fail
Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23
Race condition in query plan for merging at the end of a query
Fix error code inconsistency bug in mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q when running hive on hadoop23
SMBJoin/BucketMapJoin should be allowed only when join key expression exactly matches the sort/cluster key
[Regression] TestMTQueries test is failing on trunk
Convert runtime exceptions to semantic exceptions for validation of alter table commands
Archives broken for hadoop 1.0
Change the rules in SemanticAnalyzer to use Operator.getName() instead of hardcoded names
shims unit test failures fails further test progress
Making hive tests run against different MR versions
Hive: Query misaligned result for Group by followed by Join with filter and skip a group-by result
Add junit exclude utility to disable testcases
Upgrade Hive's Avro dependency to version 1.7
bucketed map join should check that the number of files match the number of buckets
stats are not being collected correctly for analyze table with dynamic partitions
fpair on creating external table
Hive Metatool should take serde_param_key from the user to allow for changes to avro serde's schema url key
GenMRSkewJoinProcessor uses File.Separator instead of Path.Separator
map-reduce jobs do not work for a partition containing sub-directories
Missing column causes null pointer exception
Parallel test script doesn't run all tests
Dynamic partition queries producing no partitions fail with hive.stats.reliable=true
hive unit tests fail to get lock using zookeeper on windows
insert into statement overwrites if target table is prefixed with database name
Duplicate data possible with speculative execution for dynamic partitions
Remove the specialized logic to handle the file schemas in windows vs unix from build.xml
Bug fix: Return the child JVM exit code to the parent process to handle the error conditions
Fix the file handle leaks in Symbolic & Symlink related input formats.
Hiveserver is not closing the existing driver handle before executing the next command. It results in file handle leaks.
joins using partitioned table give incorrect results on windows
RetryingRawStore logic needs to be significantly reworked to support retries within transactions
Hive List Bucketing - Skewed DDL doesn't support skewed value with string quote
CTAS in database with location on non-default name node fails
Some of the Metastore unit tests failing on Windows because of the static variables initialization problem in HiveConf class.
aggName of SemanticAnalyzer.getGenericUDAFEvaluator is generated in two different ways
Some of the JDBC test cases are failing on Windows because of the longer class path.
For UDAFs, when generating a plan without map-side-aggregation, constant agg parameters will be replaced by ExprNodeColumnDesc
Query plan for multi-join where the third table joined is a subquery containing a map-only union with hive.auto.convert.join=true is wrong
Avoid NPE in skewed information read
hivetest.py fails with --revision option
log4j template has logging threshold that hides all audit logs
Some of the tests are not deterministic
metadata_export_drop.q causes failure of other tests
QTestUtil side-effects
partition to directory comparison in CombineHiveInputFormat needs to accept partitions dir without scheme
ivysettings.xml does not let you override .m2/repository
Make separator for Entity name configurable
Hive info logging is broken
Avro Maps with Nullable Values fail with NPE
Incorrect partition bucket/sort metadata when overwriting partition with different metadata from table
ZooKeeperHiveLockManager does not respect the option to keep locks alive even after the current session has closed
derby metastore upgrade script throws errors when updating from 0.7 to 0.8
Output of sort merge join is no longer bucketed
union involving double column with a map join subquery will fail or give wrong results
Test "Path -> Alias" for explain extended
Hive always prints a warning message when using remote metastore
Drop database cascade fails when there are indexes on any tables
get_json_object and json_tuple return null in the presence of new line characters
Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data
name of some metastore scripts are not per convention
Use varbinary instead of longvarbinary to store min and max column values in column stats schema
Metastore: Sporadic unit test failures
Create index fails on CLI using remote metastore
Hive Driver leaks ZooKeeper connections
Metastore tests use hardcoded ports
Error in groupSetExpression rule in Hive grammar
Multiple aggregates in query fail the job
PTest doesn't work due to hive snapshot version upgrade to 11
hive unit test case build failure.
The derby metastore schema script for 0.10.0 doesn't run
Must publish new Hive-0.10 artifacts to apache repository.
RetryingMetaStoreClient Should Log the Caught Exception
hive pom file has missing conf and scope mapping for compile configuration.
Oracle upgrade script for Hive is broken
Cannot drop partitions on table when using Oracle metastore
Hive JIRA still shows 0.10 as unreleased in "Affects Version/s" dropdown
HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH
TestCase TestMTQueries fails with Non-Sun Java
Doc update for .8, .9 and .10
closeAllForUGI causes failure in hiveserver2 when fetching large amount of data
Improvement:
Ability to enforce correct stats
Add a configuration property that sets the variable substitution max depth
metastore 0.8 upgrade script for PostgreSQL
Collapse hive.metastore.uris and hive.metastore.local
Support auto completion for hive configs in CliDriver
Add validation to HiveConf ConfVars
Improve the HWI interface
Move global .hiverc file
Support non-MR fetching for simple queries with select/limit/filter operations only
[hive] Provide error message when using UDAF in the place of UDF instead of throwing NPE
pass an environment context to metastore thrift APIs
hive custom scripts do not work well if the data contains new lines
Make the new header for RC Files introduced in HIVE-2711 optional
Collect_set Aggregate does unnecessary check for value.
JDBC cannot find metadata for tables/columns containing uppercase character
Improve HiveMetaStore logging
add findbugs in build.xml
Add option to make multi inserts more atomic
Release codecs and output streams between flushes of RCFile
Typo in dynamic partitioning code bits, says "genereated" instead of "generated" in some places.
Add hive command for resetting hive confs
Support Bucketed mapjoin on partitioned table which has two or more partitions
BucketizedHiveInputFormat should be automatically used with SMBJoin
getting the reporter in the recordwriter
Enable Metastore audit logging for non-secure connections
Propagates filters which are on the join condition transitively
enum to string conversions
Create Table Like should copy configured Table Parameters
As a follow up for HIVE-3276, optimize union for dynamic partition queries
Keep the original query in HiveDriverRunHookContextImpl
get_json_object and json_tuple should use Jackson library
.23 compatibility: shim job.tracker.address
Add Retries to Hive MetaStore Connections
Yet better error message in CLI on invalid column name
All operators' conf should inherit from a common class
Support partial partition specifications when enabling/disabling protections in Hive
perform a map-only group by if grouping key matches the sorting properties of the table
Provide backward compatibility for AvroSerDe properties
Hive maven-publish ant task should be configurable
To add instrumentation to capture if there is skew in reducers
Log client IP address with command in metastore's startFunction method
Allow Partition Offline Enable/Disable command to be specified at the ds level even when Partition is based on more columns than ds
Refactor Partition Pruner so that logic can be reused.
Storing certain Exception objects thrown in HiveMetaStore.java in MetaStoreEndFunctionContext
Early skipping for limit operator at reduce stage
Access to external URLs in hivetest.py
Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
Revert HIVE-3268
TCP KeepAlive and connection timeout for the HiveServer
Make prompt in Hive CLI configurable
Reset operator-id before executing parse tests
RetryingHMSHandler should wrap JDOException inside MetaException
Catch the NPE when using ^D to exit from CLI
getBoolVar in FileSinkOperator can be optimized
Round map/reduce progress down when it is in the range [99.5, 100)
New Feature:
Allow SELECT without a mapreduce job
Add SerDe for Avro serialized data
Implement "show create table"
Support with rollup option for group by
replace or translate function in hive
Implement SHOW TBLPROPERTIES
Support standard cross join syntax
Add FORMAT UDF
Optionally use framed transport with metastore
SHOW COLUMNS table_name; to provide a comma-delimited list of columns.
Support for Oracle-backed Hive-Metastore ("longvarchar" to "clob" in package.jdo)
Returning Meaningful Error Codes & Messages
Create a new metastore tool to bulk update location field in Db/Table/Partition records
Add the option -database DATABASE in hive cli to specify a default database to use for the cli session.
Add ability to export table metadata as JSON on table drop
Hive List Bucketing - DDL support
Skewed Join Optimization
Disallow certain character patterns in partition names
A table generating, table generating function
sort merge join should work if both the tables are sorted in descending order
Implement CUBE and ROLLUP operators in Hive
Implement grouping sets in hive
Hive List Bucketing - Query logic
Add a command "Explain dependency ..."
Hive List Bucketing - set hive.mapred.supports.subdirectories
Hive List Bucketing - enhance DDL to specify list bucketing table
Adding authorization capability to the metastore
Add support for phonetic algorithms in Hive
Task:
Move RegexSerDe out of hive-contrib and over to hive-serde
RCFileMergeMapper Prints To Standard Output Even In Silent Mode
Implement INCLUDE_HADOOP_MAJOR_VERSION test macro
Revert HIVE-2986
Add hive.exec.rcfile.use.explicit.header to hive-default.xml.template
hive.binary.record.max.length is a magic string
Extract global limit configuration to optimizer
Improve Performance of UDF PERCENTILE_APPROX()
Track table and keys used in joins and group bys for logging
Unescape partition names returned by show partitions
Update website with info on how to report security bugs
Test:
TestHiveServerSessions hangs when executed directly
TestRemoteHiveMetaStoreIpAddress always uses the same port
Stop testing concat of partitions containing control characters.
Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk
Add tests for 'm' big tables sortmerge join with 'n' small tables where both m,n>1
add tests to use bucketing metadata for partitions
Add more tests where output of sort merge join is sorted
New test cases added by HIVE-3676 in insert1.q is not deterministic
Wish:
Log Time To Submit metric with PerfLogger

New in Apache Hive 0.9.0 (May 21, 2013)

Sub-task:
add DOAP file for Hive
Enable/Add type-specific compression for rcfile
Move retry logic in HiveMetaStore to a separate class
Add support for filter pushdown for key ranges in hbase for keys of type string
Bug:
Hive Server getSchema() returns wrong schema for "Explain" queries
"hdfs" is hardcoded in few places in the code which inhibits use of other file systems
show functions also returns internal operators
Not using map aggregation, fails to execute group-by after cluster-by with same key
HiveServer should provide per session configuration
Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory
left semi join will duplicate data
Compact index table's files merged in creation
Passing user identity from metastore client to server in non-secure mode
Insert overwrite table db.tname fails if partition already exists
Describe partition returns table columns but should return partition columns
Make a single Hive binary work with both 0.20.x and 0.23.0
Make Hive work with Hadoop 1.0.0
ignore exception for external jars via reflection
wrong class loader used for external jars
Force Bash shell on parallel test slave nodes
Parallel tests fail if master directory is not present
Allow multiple ptest runs by the same person
Parallel test commands that include cd fail
"hive.querylog.location" requires parent directory to exist or else folder creation fails
builtins JAR is not being published to Maven repo & hive-cli POM does not depend on it either
Need better exception handling in RCFile tolerate corruptions mode
StackOverflowError when using custom UDF in map join
Eclipse launch configurations fail due to unsatisfied builtins JAR dependency
get_partitions_ps throws TApplicationException if table doesn't exist
SUCESS is misspelled
a bug in 'alter table concatenate' that causes filenames to be double URL-encoded
SemanticAnalyzer twice swallows an exception it shouldn't
StackOverflowError when using custom UDF after adding archive after adding jars
Lots of special characters are not handled in LIKE
NPE in union followed by join
Remove unused lib/log4j-1.2.15.jar
Fix flaky testing infrastructure
Fix some nondeterministic test output
PlanUtils.configureTableJobPropertiesForStorageHandler() is not called for partitioned table
Single binary built against 0.20 and 0.23, does not work against 0.23 clusters.
Metastore client doesn't log properly in case of connection failure to server
CONV returns incorrect results sometimes
Hive multi group by single reducer optimization causes invalid column reference error
Remove empty java files
NPE in union with lateral view
union followed by union_subq does not work if the subquery union has reducers
Metastore is caching too aggressively
Change global_limit.q into linux format file
Remove lib/javaewah-0.3.jar
Alter Table Partition Concatenate Fails On Certain Characters
union with a multi-table insert is not working
make union31.q deterministic
Fail on table sampling
New BINARY type produces unexpected results with supported UDFS when using MapReduce2
filter is still removed due to regression of HIVE-1538 although HIVE-2344
SUBSTR(CAST( AS BINARY)) produces unexpected results
Disable loadpart_err.q on 0.23
Export LANG=en_US.UTF-8 to environment while running tests
typo in configuration parameter
TestContribCliDriver.dboutput and TestCliDriver.input45 fail on 0.23
Fix test failures caused by HIVE-2716
insert into external tables should not be allowed
cleanup readentity/writeentity
INPUT__FILE__NAME virtual column returns unqualified paths on Hadoop 0.23
Fix TestCliDriver escape1.q failure on MR2
QTestUtil.cleanUp() fails with FileNotFoundException on 0.23
Ambiguous table name or column reference message displays when table and column names are the same
Renaming partition changes partition location prefix
Metastore client doesn't close connection properly
Hive union with NULL constant and string in same column returns all null
BlockMergeTask Doesn't Honor Job Configuration Properties when used directly
TestStatsPublisherEnhanced throws NPE on JDBC connection failure
testAclPositive in TestZooKeeperTokenStore failing in clean checkout when run on Mac
HiveFileFormatUtils should use Path.SEPARATOR instead of File.Separator
GROUP BY causing ClassCastException [LazyDioInteger cannot be cast to LazyInteger]
several jars in hive tar generated are not required
JOIN + LATERAL VIEW + MAPJOIN fails to return result (seems to stop halfway through and no longer do the final reduce part)
Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data
TestCliDriver (script_pipe.q) failed with IBM JDK
Doc update for .8, .9 and .10
Improvement:
use sed rather than diff for masking out noise in diff-based tests
parallelize test query runs
Add java_method() as a synonym for the reflect() UDF
Extend concat_ws() UDF to support arrays of strings
When creating constant expression for numbers, try to infer type from another comparison operand, instead of trying to use integer first, and then long and double
Add timestamp column to the partition stats table.
pull junit jar from maven repos via ivy
Add target to install Hive JARs/POMs in the local Maven cache
Expose the HiveConf in HiveConnection API
Newly created partition should inherit properties from table
Make index table output of create index command if index is table based
move one line log from MapOperator to HiveContextAwareRecordReader
Add alterPartition to AlterHandler interface
fix Hive-2566 and make union optimization more aggressive
The variable hive.exec.mode.local.auto.tasks.max should be changed
Change arc config to hide generated files from Differential by default
Add Ant configuration property for dumping classpath of tests
Support for metastore service specific HADOOP_OPTS environment setting
The row count loaded to a table may not be right
Add 'ivy-clean-cache' and 'very-clean' Ant targets
Make ZooKeeper token store ACL configurable
Views should be added to the inputs of queries.
TestCliDriver should log elapsed time
Obtain delegation tokens for MR jobs in secure hbase setup
hbase handler uses ZooKeeperConnectionException which is not compatible with HBase versions other than 0.89
HiveStorageHandler.configureTableJobProperties() should let the handler know whether it is configuration for input or output
Improve hooks run in Driver
HBaseSerDe should allow users to specify the timestamp passed to Puts
View partitions do not have a storage descriptor
Make the IP address of a Thrift client available to HMSHandler.
Add logging of total run time of Driver
Concatenating a partition does not inherit location from table
Implement nullsafe equi-join
Cache error messages for additional logging
Change default configuration for hive.exec.dynamic.partition
Fix javadoc warnings
Remove zero length files
Add pre event listeners to metastore
Cache remote map reduce job stack traces for additional logging
Support eventual constant expression for filter pushdown for key ranges in hbase
If hive history file's directory doesn't exist don't crash
hive-config.sh should honor HIVE_HOME env
Cache local map reduce job errors for additional logging
Add a new hook to run at the beginning and end of the Driver.run method
Store which configs the user has explicitly changed
Add "rat" target to build to look for missing license headers
Remove redundant key comparing in SMBMapJoinOperator
TextConverter for UDF's is inefficient if the input object is already Text or Lazy
Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators
Add license to the Hive files
Hive metastore does not have any log messages while shutting itself down.
Remove need for storage descriptors for view partitions
Add support for filter pushdown for composite keys
New Feature:
Allow access to Primitive types stored in binary format in HBase
Implement BETWEEN operator
Implement sort_array UDF
Add reset operation and average time attribute to Metrics MBean.
add support for insert partition overwrite(...) if not exists
support hive table/partitions existing in more than one region
Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
Add PRINTF() Udf
Enable Hadoop-1.0.0 in Hive
Implement NULL-safe equality operator
Filter pushdown in hbase for keys stored in binary format
Closed range scans on hbase keys
Add JSON output to the hive ddl commands
RCFile Reader doesn't provide access to Metadata
Add nicer helper functions for adding and reading metadata from RCFiles
Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory
Task:
Hive Web Server startup messages log incorrect path it is searching for WAR
Fix test failures caused by HIVE-2589
Upgrade Hbase and ZK dependencies
Add a getAuthorizationProvider to HiveStorageHandler
Move metastore upgrade scripts labeled 0.10.0 into scripts labeled 0.9.0
Remove unnecessary JAR dependencies
Revert HIVE-2612
Revert HIVE-2795
Row number issue in hive
Test:
Test ppr_pushdown.q is failing on trunk
add a testcase for partitioned view on union and base tables have index
Wish:
Clean-up logs

New in Apache Hive 0.8.0 (May 21, 2013)

New Feature:
Add TIMESTAMP column type for thrift dynamic_type
Support "INSERT [INTO] destination"
Triggers when a new partition is created for a table
Create a Hive CLI that connects to hive ThriftServer
Allow type widening on COALESCE/UNION ALL
Add support of columnar binary serde
optimize metadata only queries
Partitioning columns should be of primitive types only
add an interface in RCFile to support concatenation of two files without (de)compression
Allow users to specify LOCATION in CREATE DATABASE statement
Accelerate GROUP BY execution using indexes
Implement map_keys() and map_values() UDFs
Extend Explode UDTF to handle Maps
Implement bitmap indexing in Hive
Add export/import facilities to the hive system
support explicit view partitioning
Block merge for RCFile
Add "DROP DATABASE ... CASCADE/RESTRICT"
Input Sampling By Splits
extend table statistics to store the size of uncompressed data (+extend interfaces for collecting other types of statistics)
Add get_table_objects_by_name() to Hive MetaStore
Add api for marking / querying set of partitions for events
support grouping on complex types in Hive
Purge expired events
Cli: Print Hadoop's CPU milliseconds
Add a Plugin Developer Kit to Hive
add TIMESTAMP data type
Support archiving for multiple partitions if the table is partitioned by multiple columns
Add Binary Datatype in Hive
Allow Hive to be debugged remotely
Literal bigint
Allow UDFs to specify additional FILE/JAR resources necessary for execution
Bug:
better error code from Hive describe command
Join operation fails for some queries
Improve the error messages for missing/incorrect UDF/UDAF class
CREATE TABLE t LIKE some_view should create a new empty base table, but instead creates a copy of view
describe parse_url throws an error
Predicate push down gets error result when sub-queries have the same alias name
Clean up references to 'hive.metastore.local'
FilterOperator is applied twice with ppd on.
ProxyFileSystem.close calls super.close twice.
job name for alter table archive partition is not correct
JDBC driver returns wrong precision, scale, or column size for some data types
SAXParseException on plan.xml during local mode.
Different defaults for hive.metastore.local
alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?)
Potential risk of resource leaks in Hive
DDLSemanticAnalyzer won't take newly set Hive parameters
Metastore operations (like drop_partition) could be improved in terms of maintaining consistency of metadata and data
Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.
Don't set ivy.home in build-common.xml
Auto convert mapjoin should not throw exception if the top operator is union operator.
Getting error when join on tables where name of table has uppercase letters
In error scenario some opened streams may not be closed in ScriptOperator.java, Utilities.java
"insert overwrite directory" Not able to insert data with multi level directory path
Exception should be thrown when invalid jar,file,archive is given to add command
Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts
HWI admin_list_jobs JSP page throws exception
Make the delegation token issued by the MetaStore owned by the right user
Add inputs and outputs to authorization DDL commands
LOAD compilation does not set the outputs during semantic analysis resulting in no authorization checks being done for it.
keyword_1.q is failing
Making JDO thread-safe by default
In Driver.execute(), mapred.job.tracker is not restored if one of the tasks fails.
Fix TestEmbeddedHiveMetaStore and TestRemoteHiveMetaStore broken by HIVE-2022
Correct the exception message for better traceability when loading into a partitioned table having 2 partitions by specifying only one partition in the load statement.
create database does not honour warehouse.dir in dbproperties
A database's warehouse.dir is not used for tables created in it.
Backport HIVE-1991 after overridden by HIVE-1950
Merge result file size should honor hive.merge.size.per.task
the retry logic in Hive's concurrency is not working correctly.
In error scenario some opened streams may not be closed
TCTLSeparatedProtocol.SimpleTransportTokenizer.nextToken() throws Null Pointer Exception in some cases
Exception on windows when using the jdbc driver. "IOException: The system cannot find the path specified"
CLI local mode hit NPE when exiting by ^D
Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
HivePreparedStatement.executeImmediate always throws exception
NullPointerException on getSchemas
Few code improvements in the ql and serde packages.
Bug: RowContainer was set to 1 in JoinUtils.
Add test coverage for external table data loss issue
auto convert map join bug
throw an error if the input is larger than a threshold for index input format
Make a couple of convenience methods in EximUtil public
virtual column references inside subqueries cause execution exceptions
Log4J initialization info should not be printed out if -S is specified
In shell mode, local mode continues if a local-mode task throws exception in pre-hooks
insert overwrite ignoring partition location
auto convert map join may miss good candidates
Remove usage of deprecated methods from org.apache.hadoop.io package
alter table concatenate fails and deletes data
Bitmap Operation UDF doesn't clear return list
Exception when no splits returned from index
Jobs do not get killed even when they created too many files.
NPE during parsing order-by expression
Block Sampling should adjust number of reducers accordingly to make it useful
Too many open files in running negative cli tests
Stats JDBC LIKE queries should escape '_' and '%'
NPE in MapJoinObjectKey
TableSample(percent ) uses one intermediate size to be int, which overflows for large sampled size, making the sampling never triggered.
Few code improvements in the metastore,hwi and ql packages.
Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus
Log related Check style Comments fixes
Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
Avoid null pointer exception when executing UDF
In Task class and its subclasses logger is initialized in constructor
Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
Dynamic Partitioning Failing because of characters not supported by globStatus
Stats table schema incompatible after HIVE-2185
Ensure HiveConf includes all properties defined in hive-default.xml
SessionState used before ThreadLocal set
While using Hive in server mode, HiveConnection.close() is not cleaning up server side resources
incorrect success flag passed to jobClose
unable to get column names for a specific table that has '_' as part of its table name
Fix a bug caused by HIVE-243
CommandNeedRetryException.java is missing ASF header
runnable queue in Driver and DriverContext is not thread safe
hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
Can't publish maven release artifacts to apache repository
Comparison Operators convert number types to common type instead of double if possible
Merge failing of join tree in exceptional case
Enable TestHadoop20SAuthBridge
Skip comments in hive script
ExecDriver::addInputPaths should pass the table properties to the record writer
Revert HIVE-2219 and apply correct patch to improve the efficiency of dropping multiple partitions
Fix Inconsistency between RB and JIRA patches for HIVE-2194
Regression introduced from HIVE-2155
ClassCastException when building index with security.authorization turned on
Error during UNARCHIVE of a partition
Comment clause should immediately follow identifier field in CREATE DATABASE statement
Allow ShimLoader to work with Hadoop 0.20-append
bad compressed file names from insert into
Fix UDAFPercentile to tolerate null percentiles
files with control-A,B are not delimited correctly.
Schema creation scripts for PostgreSQL use bit(1) instead of boolean
Incorrect regular expression for extracting task id from filename
DatabaseMetadata.getColumns() does not return partition column names for a table
Calling alter_table after changing partition comment throws an exception
Add ColumnarSerDe to the list of native SerDes
Turn off bitmap indexing when map-side aggregation is turned off
hive.zookeeper.session.timeout is set to null in hive-default.xml
Turn off compression when generating index intermediate results
DESCRIBE TABLE causes NPE when hive.cli.print.header=true
Indexes are still automatically queried when out of sync with their source tables
Predicate pushdown erroneously conservative with outer joins
Alter table always throws an unhelpful error on failure
mirror.facebook.net is 404ing
stats not updated for non "load table desc" operations
filter is removed due to regression of HIVE-1538
Fix udtf_explode.q and udf_explode.q test failures
JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types
HiveConf properties not appearing in the output of 'set' or 'set -v'
Metastore upgrade scripts for HIVE-2246 do not migrate indexes nor rename the old COLUMNS table
Slow dropping of partitions caused by full listing of storage descriptors
Minor typo in error message in HiveConnection.java (JDBC)
Invalid predicate pushdown from incorrect column expression map for select operator generated by GROUP BY operation
Incorrect alias filtering for predicate pushdown
import of multiple partitions from a partitioned table with external location overwrites files
Add Mockito to LICENSE file
published POMs in Maven repo are incorrect
Fix whitespace test diff accidentally introduced in HIVE-1360
Hive server doesn't return schema for 'set' command
Function like with empty string is throwing null pointer exception
get_privilege does not get user level privilege
File extensions not preserved in Hive.checkPaths when renaming new destination file
Metastore server tries to connect to NN without authenticating itself
Update Eclipse configuration to include Mockito dependency
BlockMergeTask ignores client-specified jars
Merging of compressed rcfiles fails to write the valuebuffer part correctly
skip corruption bug that causes data not to be decompressed
upgrading thrift version didn't upgrade libthrift.jar symlink correctly
TABLESAMPLE(BUCKET xxx) sometimes doesn't trigger input pruning as regression of HIVE-1538
Pass correct remoteAddress in proxy user authentication
remove all @author tags from source
fix Eclipse for javaewah upgrade
Primitive Data Types returning null if the data is out of range of the data type.
mapjoin_subquery dump small table (mapjoin table) to the same file
Metastore statistics are not being updated for CTAS queries.
Hive PDK needs an Ivy configuration file
HadoopJobExecHelper does not handle null counters well
Phabricator for code review
Bug from HIVE-2446, the code that calls client stats publishers run() methods is in wrong place, should be in the same method but inside of while (!rj.isComplete()) {} loop
PDK tests failing on Hudson because HADOOP_HOME is not defined
PDK PluginTest failing on Hudson
partition pruning prunes some right partitions under specific conditions
small table filesize for automapjoin is not consistent in HiveConf.java and hive-default.xml
When new instance of Hive (class) is created, the current database is reset to default (current database shouldn't be changed).
Hive throws Null Pointer Exception upon CREATE TABLE <db_name>.<table_name> .... if the given <db_name> doesn't exist
cleanup QTestUtil: use test.data.files as current directory if one not specified
Dynamic partition insert should enforce the order of the partition spec is the same as the one in schema
HIVE-2446 bug (next one) - If constructor of ClientStatsPublisher throws runtime exception it will be propagated to HadoopJobExecHelper's progress method and beyond, whereas it shouldn't
Allow people to use only issue numbers without 'HIVE-' prefix with `arc diff --jira`.
Evaluation of non-deterministic/stateful UDFs should not be skipped even if constant oi is returned.
HiveIndexResult creation fails due to file system issue
Support scientific notation for Double literals
How to submit documentation fixes
Provide jira_base_url for improved arc commit workflow
upgrade script 008-HIVE-2246.mysql.sql contains syntax errors
HIVE-2247 Changed the Thrift API causing compatibility issues.
Add Java linter to Hive
HIVE-2246 upgrade script needs to drop foreign key in COLUMNS_OLD
eclipse template .classpath is broken
HIVE-2246 upgrade script changed the COLUMNS_V2.COMMENT length
ivy offline mode broken by changingPattern and checkmodified attributes
Debug mode in some situations doesn't work properly when child JVM is started from MapRedLocalTask
Hive build fails with error "java.io.IOException: Not in GZIP format"
explain task: getJSONPlan throws a NPE if the ast is null
bug in ivy 2.2.0 breaks build
Update arcconfig to include commit listener
HBase bulk load wiki page improvements
Update README.txt file to use description from wiki
HiveCli eclipse launch configuration hangs
Hive POMs reference the wrong Hadoop artifacts
Fix eclipse classpath template broken in HIVE-2523
Fix maven-build Ant target
TestHiveServer doesn't produce a JUnit report file
revert HIVE-2566
Recent patch prevents Hadoop confs from loading in 0.20.204
Improvement:
CREATE VIEW followup: CREATE OR REPLACE
Allow UDFs to access constant parameter values at compile time
increase hive.mapjoin.maxsize to 10 million
use filter pushdown for automatically accessing indexes
HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack
Improve miscellaneous error messages
support NOT IN and NOT LIKE syntax
HiveInputFormat.readFields should print the cause when there's an exception
Ctrl+c should kill currently running query, but not exit the CLI
The class HiveResultSet should implement batch fetching.
Task-cleanup task should be disabled
HIVE-78 Followup: group partitions by tables when doing authorizations and there is no partition level privilege
Change Default Alias For Aggregated Columns (_c1)
mapjoin operator should not load hashtable for each new inputfile if the hashtable to be loaded is already there.
recognize transitivity of predicates on join keys
Hive Shell to output number of mappers and number of reducers
Support new annotation @UDFType(stateful = true)
adding comments to Hive Stats JDBC queries
Expand exceptions caught for metastore operations
avoid loading Hive aux jars in CLI remote mode
Create a separate namespace for Hive variables
Performance instruments for client side execution
isEmptyPath() to use ContentSummary cache
Use block-level merge for RCFile if merging intermediate results is needed
Update bitmap indexes for automatic usage
Metastore listener
remove hadoop version check from hive cli shell script
getInputSummary() to call FileSystem.getContentSummary() in parallel
PostHook and PreHook API to add flag to indicate it is pre or post hook plus cache for content summary
Generate single MR job for multi groupby query if hive.multigroupby.singlemr is enabled.
Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
SHOW GRANT grantTime field should be a human-readable timestamp
Reduce memory consumption in preparing MapReduce job
Increase the number of operator counters
No lock for some non-mapred tasks; config variable hive.lock.mapred.only.operation added
Optimizer on partition field
Hive's symlink text input format should be able to work with CombineHiveInputFormat
Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait
Automatic Indexing with multiple tables
DROP TABLE IF EXISTS should not fail if a view of that name exists
Remove System.exit
Enables HiveServer to accept -hiveconf option
reduce workload generated by JDBCStatsPublisher
Add api to send / receive message to metastore
Add interface classification in Hive.
add exception handling to hive's record reader
Improve error messages emitted during semantic analysis
Improve error messages emitted during task execution
Allow custom serdes to set field comments
Allow optional [inner] on equi-join.
Add actions for alter table and alter partition events for metastore event listeners
reduce name node calls in hive by creating temporary directories
create a new API in Warehouse where the root directory is specified
Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
ALTER VIEW RENAME
Optimize partial specification metastore functions
add Query text for debugging in lock data
speedup addInputPaths
Make "alter table drop partition" more efficient
Provide metastore upgrade script for HIVE-2215
Ability to add partitions atomically
Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
Show current database in hive prompt
Make CombineHiveInputFormat the default hive.input.format
Dedupe tables' column schemas from partitions in the metastore db
Display a sample of partitions created when Fatal Error occurred due to too many partitions created
Better error message in CLI on invalid column name
Local mode needs to work well with block sampling
bucketized map join should allow join key as a superset of bucketized columns
Improve error messages for DESCRIBE command
Optimize Hive query startup time for multiple partitions
Add hooks to run when execution fails.
Make Hadoop Job ID available after task finishes executing
Improve RCFile Read Speed
Support automatic rebuilding of indexes when they go stale
Make performance logging configurable.
Improve RCFileCat performance significantly
Warn user that precision is lost when bigint is implicitly cast to double.
Local Mode can be more aggressive if LIMIT optimization is on
RCFileReader Buffer Reuse
Allow RCFile Reader to tolerate corruptions
make hive mapper initialize faster when having tons of input files
The PerfLogger should log the full name of hooks, not just the simple name.
Introduction of client statistics publishers possibility
Add job ID to MapRedStats
Upgrade JavaEWAH to 0.3
move lock retry logic into ZooKeeperHiveLockManager
Need a way to categorize queries in hooks for improved logging
JDBCStatsAggregator DELETE STATEMENT should escape _ and %
Files in Avro-backed Hive tables do not have a ".avro" extension
Group-by query optimization Followup: add flag in conf/hive-default.xml
Add method to PerfLogger to perform cleanup/final steps.
make INNER a non-reserved keyword
HA Support for Metastore Server
Improve support for Constant Object Inspectors
Log more Hadoop task counter values in the MapRedStats class.
Enable ALTER TABLE SET SERDE to work on partition level
Update junit jar in testlibs
Get ConstantObjectInspectors working in UDAFs
Make Constant OIs work with UDTFs.
add a new builtins subproject
Consecutive string literals should be combined into a single string literal.
Use sorted nature of compact indexes
Make metastore log4j configuration file configurable again.
add explain formatted
Use hashing instead of list traversal for IN operator for primitive types
reduce the number of map-reduce jobs for union all
Too much debugging info on console if a job failed
avoid referencing /tmp in tests
Setting no_drop on a table should cascade to child partitions
Add caching to json_tuple
Add hook to run in metastore's endFunction which can collect more fb303 counters
Task:
Hive in Maven
Provide Metastore upgrade scripts and default schemas for PostgreSQL
Remaining patch for HIVE-2148
Use the version commons-codec from Hadoop
Upgrade Hive's Thrift dependency to version 0.7.0
Metastore upgrade scripts for schema change introduced in HIVE-2215
Metastore upgrade script and schema DDL for Hive 0.8.0
Make Hive compile against Hadoop 0.23
Add pdk, hbase-handler etc as source dir in eclipse
Update wiki links in README file
Omit incomplete Postgres upgrade scripts from release tarball
Sub-task:
Support JDBC ResultSetMetadata
Bundle Log4j configuration files in Hive JARs
Push down partition pruning to JDO filtering for a subset of partition predicates
batch processing partition pruning process
Backward incompatibility introduced from HIVE-2082 in MetaStoreUtils.getPartSchemaFromTableSchema()
Partition Pruning bug in the case of hive.mapred.mode=nonstrict
Return correct Major / Minor version numbers for Hive Driver
add the HivePreparedStatement implementation based on current HIVE supported data-type
add a TM to Hive logo image
Update project naming and description in Hive wiki
Update project naming and description in Hive website
update project website navigation links
add trademark attributions to Hive homepage
Update project description and wiki link in ivy.xml files
Test:
Test that views with joins work properly
TestLazySimpleSerde fails randomly
create a test to verify that partition pruning works for partitioned views with a union
Wish:
^C breaks out of running query, but not whole CLI

New in Apache Hive 0.7.0 (May 21, 2013)

New Feature:
Authorization infrastructure for Hive
Implement Indexing in Hive
Add reflect() UDF for reflective invocation of Java methods
Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
Authentication Infrastructure for Hive
Hive Variables
Concurrency Model for Hive
add row_sequence UDF
hive command line option -i to run an init file before other SQL commands
add option to let hive automatically run in local mode based on tunable heuristics
bring a table/partition offline
sentences() UDF for natural language tokenization
ngrams() UDAF for estimating top-k n-gram frequencies
Be able to modify a partition's fileformat and file location information.
context_ngrams() UDAF for estimating top-k contextual n-grams
Add json_tuple() UDTF function
Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
Add ANSI SQL correlation aggregate function CORR(X,Y).
Support partition filtering in metastore
Patch to allow scripts in S3 location
Implement "SHOW TABLES {FROM | IN} db_name"
parse_url_tuple: a UDTF version of parse_url
Default values for parameters
Implement GenericUDF str_to_map
Patch to support HAVING clause in Hive
track the joins which are being converted to map-join automatically
Call frequency and duration metrics for HiveMetaStore via jmx
maintain lastAccessTime in the metastore
Make Hive database data center aware
Add a new local mode flag in Task.
Better auto-complete for Hive
Support ALTER DATABASE to change database properties
Implement DROP TABLE/VIEW ... IF EXISTS
Implement DROP {PARTITION, INDEX, TEMPORARY FUNCTION} IF EXISTS
Make the MetaStore filesystem interface pluggable via the hive.metastore.fs.handler.class configuration property
add an option (hive.index.compact.file.ignore.hdfs) to ignore HDFS location stored in index files.
Verbose/echo mode for the Hive CLI
Improvement:
Provide option to export a HEADER
Support for distinct selection on two or more columns
describe extended table/partition output is cryptic
Missing some Jdbc functionality like getTables getColumns and HiveResultSet.get* methods based on column name.
Tapping logs from child processes
support filter pushdown against non-native tables
replace dependencies on HBase deprecated API
use Ivy for fetching HBase dependencies
Make Hive work with Hadoop security
Return value for map, array, and struct needs to return a string
do not update transient_lastDdlTime if the partition is modified by a housekeeping operation
automatically invoke .hiverc init script
add CLI command for executing a SQL script
serializing/deserializing the query plan is useless and expensive
Extend ivy offline mode to cover metastore downloads
Add support to turn off bucketing with ALTER TABLE
Speed up reflection method calls in GenericUDFBridge and GenericUDAFBridge
potential NullPointerException
hive output file names are unnecessarily large
replace isArray() calls and remove LOG.isInfoEnabled() in Operator.forward()
supply correct information to hooks and lineage for index rebuild
support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
support IDXPROPERTIES on CREATE INDEX
Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
hive starter scripts should load admin/user supplied script for configurability
ability to select across a database
Use ZooKeeper from maven
Add support for JDBC PreparedStatements
Ability to plug custom Semantic Analyzers for Hive Grammar
CompactIndexInputFormat should create split only for files in the index output file.
regression and improvements in handling NULLs in joins
Add alternative search-provider to Hive site
Add ProtocolBuffersStructObjectInspector
ScriptOperator's AutoProgressor can lead to an infinite loop
Use CombineHiveInputFormat for the merge job if hive.merge.mapredfiles=true
convert commonly used udfs to generic udfs
add map joined table to distributed cache
Convert join queries to map-join based on size of table/row
ability to specify parent directory for zookeeper lock manager
Adding consistency check at jobClose() when committing dynamic partitions
Change get_partitions_ps to pass partition filter to database
FetchOperator.getInputFormatFromCache hides causal exception
drop support for pre-0.20 Hadoop versions
remove Hadoop 0.17 specific test reference logs
Optimize Key Comparison in GroupByOperator
Group-by to determine equals of Keys in reverse order
Support for using ALTER to set IDXPROPERTIES
ExecMapper and ExecReducer: reduce function calls to l4j.isInfoEnabled()
Remove Partition Filtering Conditions when Possible
Optimize ColumnarStructObjectInspector.getStructFieldData()
Remove JDBM component from Map Join
test cleanup for Hive-1641
optimize group by hash map memory
Support show locks for a particular table
Add queryid while locking
Update transient_lastDdlTime only if not specified
add more debug information for hive locking
CommonJoinOperator optimize the case of 1:1 join
change Pre/Post Query Hooks to take in 1 parameter: HookContext
Improve documentation for str_to_map() UDF
optimize the code path when there are no outer joins
dumps time at which lock was taken along with the queryid in show locks extended
Compress the hashtable dump file before putting it into distributed cache
Clear empty files in Hive
HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
Show the time the local task takes
create a new ZooKeeper instance when retrying lock, and more info for debug
Add an option to run task to check map-join possibility in non-local mode
more debugging for locking
add an option in dynamic partition inserts to throw an error if 0 partitions are created
Reduce unnecessary DFSClient.rename() calls
Include Process ID in the log4j log file name
redo zookeeper hive lock manager
add a factory method for creating a synchronized wrapper for IMetaStoreClient
a mapper should be able to span multiple partitions
Store jobid in ExecDriver
Provide config parameters to control cache object pinning
Allow any type of stats publisher and aggregator in addition to HBase and JDBC
Find a way to disable owner grants
Improve the implementation of the METASTORE_CACHE_PINOBJTYPES config
Have audit logging in the Metastore
Provide DFS initialization script for Hive
Make Stats gathering more flexible with timeout and atomicity
make a libthrift.jar and libfb303.jar in dist package for backward compatibility
Modify build to run all tests regardless of subproject failures
Hive SymlinkTextInputFormat does not estimate input size correctly
Bug:
"LOAD DATA LOCAL INPATH" fails when the table already contains a file of the same name
NULL is not handled correctly in join
HiveInputFormat.getInputFormatFromCache "swallows" cause exception when throwing IOException
add progress in join and groupby
Simple UDAFs with more than 1 parameter crash on empty row query
UDF field() doesn't work
Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
skip counter update when RunningJob.getCounters() returns null
FetchOperator(mapjoin) does not work with RCFile
bug in 'set fileformat'
Make Eclipse launch templates auto-adjust to Hive version number changes
Reporting progress in FileSinkOperator works in multiple directory case
hive-site.xml ${user.name} not replaced for local-file derby metastore connection URL
percentile_approx() fails with more than 1 reducer
CTAS should unescape the column name in the select-clause.
plan file should have a high replication factor
.gitignore files being placed in test warehouse directories causing build failure
TestCliDriver -Doverwrite=true does not put the file in the correct directory
fix or disable loadpart_err.q
Index followup: remove sort by clause and fix a bug in collect_set udaf
when generating reentrant INSERT for index rebuild, quote identifiers using backticks
Add cleanup method to HiveHistory class
Monitor the working set of the number of files
HiveCombineInputFormat should not use prefix matching to find the partitionDesc for a given path
hive.mapred.local.mem should only be used in case of local mode job submissions
ql tests no longer work in miniMR mode
Replace globStatus with listStatus inside Hive.java's replaceFiles.
Join filters do not work correctly with outer joins
alter partition should throw exception if the specified partition does not exist.
Unarchiving operation throws NPE
populate inputs and outputs for all statements
Fix TestContribCliDriver test
smb_mapjoin_8.q returns different results in miniMr mode
HBase tests broken
bucketizedhiveinputformat.q fails in minimr mode
referencing an added file by its name in a transform script does not work in hive local mode
Add conf. property hive.exec.show.job.failure.debug.info to enable/disable displaying link to the task with most failures
cleanup ExecDriver.progress
Hive should not override Hadoop specific system properties
wrong log files in contrib client positive
Add HBase/ZK JARs to Eclipse classpath
udtf_explode.q is an empty file
use SequenceFile rather than TextFile format for hive query results
need to sort hook input/output lists for test result determinism
Hadoop 0.17 ant test broken by HIVE-1523
For a null value in a string column, JDBC driver returns the string "NULL"
Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
UDTF json_tuple should return null row when input is not a valid JSON string
Fix Base64TextInputFormat to be compatible with commons codec 1.4
Patch to fix hashCode method in DoubleWritable class
bug in NO_DROP
CombineHiveInputFormat fails with "cannot find dir for emptyFile"
ExecDriver.addInputPaths() error if partition name contains a comma
Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )
TestContribNegativeCliDriver fails
All TestJdbcDriver test cases fail in Eclipse unless a property is added in run config
join results are displayed wrongly for some complex joins using select *
Fix describe * [extended] column formatting
ql/src/java/org/apache/hadoop/hive/ql/parse/SamplePruner.java is empty
Eclipse build broken
MapJoin throws EOFException when the mapjoined table has 0 column selected
multithreading on Context.pathToCS
Create table bug causes the row format property lost when serde is specified.
count(*) returns wrong result when a mapper returns empty results
NPE in MapJoin
In the MapJoinOperator, the code uses tag as alias, which is not always true
ANALYZE TABLE command should check columns in partition spec
incorrect partition pruning ANALYZE TABLE
bug when different partitions are present in different dfs
CREATE TABLE LIKE should not set stats in the new table
Migrating metadata from derby to mysql throws NullPointerException
duplicated MapRedTask in Multi-table inserts mixed with FileSinkOperator and ReduceSinkOperator
make TestHBaseCliDriver use dynamic ports to avoid conflicts with already-running services
ant clean should delete stats database
hbase_stats.q is failing
Two Bugs for Estimating Row Sizes in GroupByOperator
Fix Eclipse templates (and use Ivy metadata to generate Eclipse library dependencies)
Statistics broken for tables with size in excess of Integer.MAX_VALUE
HIVE 1633 hit for Stage2 jobs with CombineHiveInputFormat
failures in fatal.q in TestNegativeCliDriver
Many important broken links on Hive web page
Mismatched open/commit transaction calls in case of connection retry
Merge files does not work with dynamic partition
pcr.q output is non-deterministic
ROUND(infinity) chokes
Assertion on inputObjInspectors.length in GroupBy operator
parallel execution and auto-local mode combine to place plan file in wrong file system
Outdated comments for GenericUDTF.close()
Typo in hive-default.xml
outputs not populated for dynamic partitions at compile time
GenericUDFOr and GenericUDFAnd cannot receive boolean typed object
outputs not correctly populated for alter table
Mapjoin will fail if there are no files associated with the join tables
The merge criteria on dynamic partitions should be per partition
No Element found exception in BucketMapJoinOptimizer
bug in auto_join25.q
Hive comparison operators are broken for NaN values
spurious rmr failure messages when inserting with dynamic partitioning
show locks should not use getTable()/getPartition
Fix intermittent failures in TestRemoteMetaStore
mappers in group followed by joins may die OOM
Hanging hive client caused by TaskRunner's OutOfMemoryError
Some attributes in the Eclipse template file are deprecated
change hive assumption that local mode mappers/reducers always run in same jvm
bug in MAPJOIN
add more logging to partition pruning
downgrade JDO version
Temporarily disable metastore tests for listPartitionsByFilter()
mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message
Hive's smallint datatype is not supported by the Hive JDBC driver
Hive's float datatype is not supported by the Hive JDBC driver
Revive partition filtering in the Hive MetaStore
Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
test load_overwrite.q fails
Add mechanism for disabling tests with intermittent failures
TestRemoteHiveMetaStore.java accidentally deleted during commit of HIVE-1845
bug introduced by HIVE-1806
Fix 'tar' build target broken in HIVE-1526
fix HBase filter pushdown broken by HIVE-1638
Set the version of Hive trunk to '0.7.0-SNAPSHOT' to avoid confusing it with a release
HBase and Contrib JAR names are missing version numbers
Alter command execution "when HDFS is down" results in holding stale data in MetaStore
create script for the metastore upgrade due to HIVE-78
Can't join HBase tables if one's name is the beginning of the other
FileHandler leak on partial iteration of the resultset.
Double escaping special chars when removing old partitions in rmr
use partition level serde properties
failures in testhbaseclidriver
authorization on database level is broken.
CTAS (create-table-as-select) throws exception when showing results
Fix TestHadoop20SAuthBridge failure on Hudson
GRANT/REVOKE should handle privileges as tokens, not identifiers
alter table rename messes the location
hive.semantic.analyzer.hook cannot have multiple values
Fix test failure in TestContribCliDriver/url_hook.q
dynamic partition insert creating different directories for the same partition during merge
input16_cc.q is failing in testminimrclidriver
fix some outputs and make some tests deterministic
add fully deterministic ORDER BY in test union22.q and input40.q
TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
fix hbase_bulk.m by setting HiveInputFormat
TestHadoop20SAuthBridge failed on current trunk
Mismatched open/commit transaction calls when using get_partition()
Update README.txt and add missing ASF headers
Executing queries using Hive Server is not logging to the log file specified in hive-log4j.properties
Improve naming and README files for MetaStore upgrade scripts
upgrade-0.6.0.mysql.sql script attempts to increase size of PK COLUMNS.TYPE_NAME to 4000
Add datanucleus.identifierFactory property to HiveConf to avoid unintentional MetaStore Schema corruption
Make call to SecurityUtil.getServerPrincipal unambiguous
Sub-task:
table/partition level statistics
Add delegation token support to metastore
a followup patch for changing the description of hive.exec.pre/post.hooks in conf/hive-default.xml
upgrade the database thrift interface to allow parameters key-value pairs
Extend the CREATE DATABASE command with DBPROPERTIES
Add the local flag to all the map red tasks, if the query is running locally.
Task:
Hive should depend on a release version of Thrift
Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
Update Metastore upgrade scripts to handle schema changes introduced in HIVE-1413
Remove CHANGES.txt
Create MetaStore schema upgrade scripts for changes made in HIVE-417
Provide MetaStore schema upgrade scripts for changes made in HIVE-1823
Test:
improve test query performance
JDBM diff in test caused by Hive-1641
merge_dynamic_part's result is not deterministic
change the value of hive.input.format to CombineHiveInputFormat for tests

New in Apache Hive 0.6.0 (May 21, 2013)

New Feature:
Add PERCENTILE aggregate function
add database/schema support Hive QL
Hive HBase Integration (umbrella)
row-wise IN would be useful
CommandProcessor should return DriverResponse
add udaf max_n, min_n to contrib
Bucketed Map Join
support views
multi-partition inserts
Create UDFs for XPath expression evaluation
Better Error Messages for Execution Errors
Let user script write out binary data into a table
CombinedHiveInputFormat for hadoop 19
Add UDF to create struct
Add column lineage information to the pre execution hooks
Add metastore API method to get partition by name
bucketing mapjoin where the big table contains more than 1 big partition
enforce bucketing for a table
Add UDF array_contains
ensure sorting properties for a table
sorted merge join
create a new input format where a mapper spans a file
More robust handling of metastore connection failures
Get partitions with a partial specification
Add mathematical UDFs PI, E, degrees, radians, tan, sign, and atan
Thread pool size in Thrift metastore server should be configurable
Add SymlinkTextInputFormat to Hive
Partition name to values conversion method
More generic and efficient merge method
Archiving partitions
Tool to cat rcfiles
histogram() UDAF for a numerical column
Web Interface can only browse default
Add TCP keepalive option for the metastore server
Alter the number of buckets for a table
Bug:
support count(*) and count distinct on multiple columns
getSchema returns invalid column names, getThriftSchema does not return old style string schemas
GenericUDTFExplode() throws NPE when given nulls
desc Table should work
typedbytes does not support nulls
function in a transform with more than 1 argument fails
Predicate push down does not work with UDTF's
NPE when operating HiveCLI in distributed mode
TestContribCliDriver failure in serde_typedbytes.q, serde_typedbytes2.q, and serde_typedbytes3.q
Make it possible for users to recover data when moveTask fails
ColumnarSerde should not be the default Serde when user specified a fileformat using 'stored as'.
Add "-Doffline=true" option to ant
Skew Join does not work in distributed env.
Conditional task does not increase finished job counter when filter job out.
Disable streaming last table if there is a skew key in previous tables.
bug with alter table rename when table has property EXTERNAL=FALSE
create view should expand the query text consistently
Hive CLI shows 'Ended Job=' at the beginning of the job
Assertion in ExecDriver.execute when assertions are enabled in HADOOP_OPTS
"datanucleus" typos in conf/hive-default.xml
Use TreeMap instead of Property to make explain extended deterministic
Job counter error if "hive.merge.mapfiles" equals true
'create if not exists' fails for a table name with 'select' in it
Expression Not In Group By Key error is sometimes masked
Fix RCFile resource leak when opening a non-RCFile
Increase ObjectInspector[] length on demand
Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
typedbytes: writing to stderr kills the mapper
RowContainer should flush out dummy rows when the table desc is null
ScriptOperator AutoProgressor does not set the interval
CombineHiveInputFormat does not work for compressed text files
hints cannot be passed to transform statements
Task breaking bug when breaking after a filter operator
date_sub() function returns wrong date because of daylight saving time difference
joins between HBase tables and other tables (whether HBase or not) are broken
set merge files to files when bucketing/sorting is being enforced
ql.metadata.Hive#close() should check for null metaStoreClient
Cannot start metastore thrift server on a specific port
Case sensitiveness of type information specified when using custom reducer causes type mismatch
UDF_Percentile NullPointerException
bug in sort merge join if the big table does not have any row
TestHBaseCliDriver hangs
Select query with specific projection(s) fails if the local file system directory for ${hive.user.scratchdir} does not exist.
problem in combinehiveinputformat with nested directories
Bucketing column names in create table should be case-insensitive
error/info message being emitted on standard output
sort merge join does not work with bucketizedhiveinputformat
Fix UDAFPercentile IndexOutOfBoundsException
HIVE_AUX_JARS_PATH interferes with startup of Hive Web Interface
unit test symlink_text_input_format.q needs ORDER BY for determinism
= throws NPE
bug in use of hadoop supports splittable
hive trunk does not compile with hadoop 0.17 any more
bucketed sort merge join breaks after dynamic partition insert
CombineHiveInputFormat throws exception when partition name contains special characters to URI
NPE with lineage in a query of union alls on joins.
bugs with temp directories, trailing blank fields in HBase bulk load
Cached FileSystem can lead to persistent IOExceptions
leading dash in partition name is not handled properly
dynamic partition insert should throw an exception if the number of target table columns + dynamic partition columns does not equal the number of select columns
RowContainer uses hard-coded '/tmp/' path for temporary files
Group by partition column returns wrong results
fatal error check omitted for reducer-side operators
select * does not work if different partitions contain different formats
Fix bin/ext/jar.sh to work with hadoop 0.20 and above
Filter Operator Column Pruning should preserve the column order
TypedBytesSerDe fails to create table with multiple columns.
hive.query.id is not unique
rcfilecat should use '\t' to separate columns and print '\r\n' at the end of each row.
load_dyn_part*.q tests need ORDER BY for determinism
partition level properties honored if they exist
Increase the maximum length of various metastore fields, and remove TYPE_NAME from COLUMNS primary key
Bug in SMBJoinOperator which may cause a final part of the results in some cases.
inputFileFormat error if the merge job takes a different input file format than the default output file format
remove blank in rcfilecat
Missing connection pool plugin in Eclipse classpath
getPartitionDescFromPath() in CombineHiveInputFormat should handle matching by path
combinehiveinputformat does not work if files are of different types
Reporting progress to JT during closing files in FileSinkOperator
Add hadoop-*-tools.jar to Eclipse classpath
File format information is retrieved from first partition
DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
CombineHiveInputFormat bug on tablesample
Archived partitions throw error with queries calling getContentSummary
column pruning not working with lateral view
problem with sequence and rcfiles are mixed for null partitions
hive.task.progress should be added to conf/hive-default.xml
ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
Upgraded naming scheme causes JDO exceptions
bug in 'set fileformat'
insert overwrite and CTAS fail in hive local mode
lateral view does not work with column pruning
FileSinkOperator should remove duplicated files from the same task based on file sizes
parallel execution failed if mapred.job.name is set
Typo of hive.merge.size.smallfiles.avgsize prevents change of value
hive --service jar looks for hadoop version but was not defined
Web Interface JSP needs Refactoring for removed meta store methods
ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
Migration scripts should increase size of PARAM_VALUE in PARTITION_PARAMS
Improvement:
provide option to run hive in local mode
handle skewed keys for a join in a separate job
Incorporate CheckStyle into Hive's build.xml
Merge tasks in GenMRUnion1
CREATE VIEW followup: add a "table type" enum attribute in metastore's MTable, and also null out irrelevant attributes for MTable instances which describe views
CREATE VIEW followup: find and document current expected version of thrift, and regenerate code to match
Add a "skew join map join size" variable to control the input size of skew join's following map join job.
make number of concurrent tasks configurable
QueryPlan to be independent from BaseSemanticAnalyzer
Structured temporary directories
add counters to show that skew join triggered
Make QueryPlan serializable
Add hive.merge.size.per.task to HiveConf
Make all Tasks and Works serializable
In ivy offline mode, don't delete downloaded jars
Make ql/metadata/Table and Partition serializable
Let max/min handle complex types like struct
add type-checking setters for HiveConf class to match existing getters
CREATE VIEW followup: support ALTER TABLE SET TBLPROPERTIES on views
Add comment to explain why we check for dir first in add_partitions().
Add metastore API method to drop partition / append partition by name
drop_partition_by_name() should use drop_partition_common()
Configure build to download Hadoop tarballs from Facebook mirror instead of Apache
When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
Explicitly say "Hive Internal Error" to ease debugging
Show the row with error in mapper/reducer
accept TBLPROPERTIES on CREATE TABLE/VIEW
allow HBase key column to be anywhere in Hive table
add pre-drops in bucketmapjoin*.q
add backward-compatibility constructor to HiveMetaStoreClient
mapjoin followed by another mapjoin should be performed in a single query
from_unixtime should implement an overloaded function to accept only bigint type
optimize bucketing
facilitate HBase bulk loads from Hive
CLI set and set -v commands should dump properties in alphabetical order
error message in Hive.checkPaths dumps Java array address instead of path string
support: alter table touch partition
cleanup the jobscratchdir
Increase the memory limit for CLI client
make mapred.input.dir.recursive work for select *
for ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'), change TBL_TYPE attribute from MANAGED_TABLE to EXTERNAL_TABLE
DataNucleus should use connection pooling
Moving inputFileChanged() from ExecMapper to where it is needed
Do not pull counters of non initialized jobs
Hive should use NullOutputFormat for hadoop jobs
CombineHiveInputSplit should initialize the inputFileFormat once for a single split
New algorithm for variance() UDAF
allow HBase WAL to be disabled
Add PERCENTILE_APPROX which works with double data type
Make Hive build work with Ivy versions < 2.1.0
set abort in ExecMapper when Hive's record reader got an IOException
Make the compile target depend on thrift.home
Task:
Automated source code cleanup
Cleanup Class names
Add .gitignore file
Suppress Checkstyle warnings for generated files
Replace instances of StringBuffer/Vector with StringBuilder/ArrayList
Checkstyle fixes
Use Anakia for version controlled documentation
build references IVY_HOME incorrectly
Update Eclipse project configuration to match Checkstyle
Eclipse launchtemplate changes to enable debugging
fix Hive logo img tag to avoid stretching
Provide metastore schema migration scripts (0.5 -> 0.6)
Provide Postgres metastore schema migration scripts (0.5 -> 0.6)
Include metastore upgrade scripts in release tarball
Update README file for 0.6.0 release
Satisfy ASF release management requirements
Sub-task:
checking VOID type for NULL in LazyBinarySerde
Test:
NPE when running TestJdbcDriver/TestHiveServer
test HBase input format plus CombineHiveInputFormat
temporarily disable HBase test execution
Unit test should be shim-aware